MATHEMATICS OF PHYSICS
AND MODERN ENGINEERING
I. S. Sokolnikoff
Professor of Mathematics
University of California, Los Angeles
R. M. Redheffer
Associate Professor of Mathematics
University of California, Los Angeles
INTERNATIONAL STUDENT EDITION
McGRAW-HILL BOOK COMPANY, INC.
New York Toronto London
KOGAKUSHA COMPANY, LTD.
Tokyo
Exclusive rights by Kogakusha Co., Ltd., for manufacture and
export from Japan. This book cannot be re-exported from
the country to which it is consigned by Kogakusha Co., Ltd.
Copyright © 1958 by the McGraw-Hill Book Company, Inc.
All rights reserved. This book, or parts thereof, may not be
reproduced in any form without permission of the publishers.
Library of Congress Catalog Card Number 57-12914
TOSHO INSATSU PRINTING CO., LTD., TOKYO, JAPAN
PREFACE
The rapidly decreasing time lag between scientific discoveries and appli-
cations imposes ever-increasing demands on the mathematical equipment
of scientists and engineers. Although the mathematical preparation of
engineering students has been strengthened materially in the past thirty
years, the introduction of courses beyond the traditional “terminal course”
in calculus has been largely confined to a few leading institutions. The
reluctance to broaden significantly the program of instruction in mathe-
matics can be attributed in part to the crowded engineering curricula, in
part to the failure to sense the central position of mathematics in sciences
and technology, and in part to the scarcity of suitable staffs and instruc-
tional media. The broadening, however, is inevitable, for it is now gen-
erally recognized that no professional engineer can keep abreast of scien-
tific developments without substantially extending his mathematical hori-
zons.
This book, in common with its predecessor written by the senior author
some twenty-five years ago, has as its main aim a sound extension of such
horizons. The authors not only have been guided by their subjective
appraisal of the live present-day needs of the engineering profession but
have also taken into account the views of the leaders of engineering
thought as expressed in numerous conferences and symposia on engineer-
ing education sponsored by the National Science Foundation, the American
Society of Engineering Education, and its predecessor the Society for the
Promotion of Engineering Education.
There are many conflicting and often prejudiced currents of thought as
to how mathematics should be presented to students of applied sciences.
Some believe that mathematics is one whole and indivisible and hence
should be presented unto all alike, regardless of the differing creeds. Others
are content with a catalogue of useful formulas, rules, and devices for
solving problems. The authors think that these two extreme viewpoints
are somewhat limited, since they recognize only two of the many facets of
mathematics. A preoccupation with the logic of mathematics and the over-
emphasis of a convention called rigor are among the best known means for
stifling interest in mathematics as a crutch to common sense. On the other
hand, a presentation which puts applications above the medium making
applications possible is sterile, because it gives no inkling of the supreme
importance of generalizations and abstractions in applications. The au-
thors have tried to strike a balance which would make this book both a
sound and an inspiring introduction to applied mathematics.
The material in this book appears in nine chapters, each of which is
complete and virtually independent of the others. Occasional cross refer-
ences to other chapters are intended to correlate the topics and to enhance
the usefulness of the book as a reference volume. Each chapter is sub-
divided into functional parts, many of which also form an organized whole.
The earlier parts of each chapter are less advanced and should serve as
an introduction to more difficult topics treated in the later parts. The
text material set in small type usually deals with generalizations and de-
velops the less familiar concepts which are sure to grow in importance in
applications.
The choice of topics is based on the authors' estimate of the frequency
with which the subjects treated occur in applications. The illustrative
material, examples, and problems have been chosen more for their value
in emphasizing the underlying principles than as a collection of instances
of dramatic uses of mathematics in specific situations confronting prac-
ticing engineers.
Although the book is written so as to require little, if any, outside help,
the reader is cautioned that no amount of exposition can serve as a substi-
tute for concentration in following the course of the argument in a serious
discipline. In order to facilitate the understanding of the principles and
to cultivate the art of formulating physical problems in the language of
mathematics, numerous illustrative examples are worked out in detail.
The authors believe with Newton that exempla non minus docent quam
praecepta.
I. S. Sokolnikoff
R. M. Redheffer
TO THE INSTRUCTOR
In the sense that a working course in calculus is the sole technical pre-
requisite, this book is suitable for the beginner in applied mathematics.
But when viewed in the light of the present-day requirements of the engi-
neering profession, the text includes a large amount of material of direct
interest to practicing engineers.
It is certain that within the next twenty years the methods of functional
analysis and, in particular, the Hilbert space theory will be in general use
in technology. A foundation for the assimilation of the function-space
concepts should be laid now, and we did not hesitate to do so in several
places in this book.
We have arranged the contents in nine independent chapters which, in
turn, are subdivided into parts, most of which can be read independently
of the rest. The earlier parts of each chapter are less advanced, and our
experience has shown that several introductory courses for students of sci-
ence and technology can be based on the material contained in the earlier
parts. When taken in sequence, this book has ample substance for four
consecutive semester courses meeting three hours a week.
This book is also suitable for courses in mathematical analysis bearing
such labels as ordinary differential equations, partial differential equations,
vector analysis, advanced calculus, complex variable, and so on.
Thus Chap. 1, when supplemented by Secs. 12 to 14 of Chap. 2, has
adequate material for a solid semester course in ordinary differential equa-
tions. Instructors wishing to include an introduction to numerical meth-
ods of solutions of differential equations will find suitable material in Secs.
14 to 18 of Chap. 9. The use of Laplace transforms in solving differential
equations is discussed in Appendix B, which includes, among other things,
a meaningful introductory presentation of the “Dirac delta function.”
Chapter 6, together with Secs. 18 to 25 of Chap. 2, has ample material
for a semester course in partial differential equations.
Chapters 4 and 5 have sufficient content for a modern course in vector
analysis.
Chapter 7, preceded by the relevant topics on line integrals in Chap. 5,
is adequate for an introductory course in complex variable theory.
Chapter 8 can be used in a semester course on probability theory and
applications meeting two hours a week. A course entitled “Probability
and Numerical Methods” meeting three hours a week can be based on
the material in Chaps. 8 and 9.
Although this book was written primarily for students of physical sciences,
it is unlikely that a liberal arts student who followed it in an advanced
calculus course would be obliged to “unlearn” anything in his
subsequent studies.
The contents of this book include what we believe should be the mini-
mum mathematical equipment of a scientific engineer. It may not be out
of place to note that the mathematical preparation of physicists and engi-
neers in Russia exceeds the minimum laid down here. While the curricula
of only a few leading American engineering colleges provide now for more
than one year of mathematics beyond calculus, their number will continue
to increase with the realization that the time allotted to mathematics is a
sound capital investment, yielding excellent returns both in the time gained
in professional studies and in the depth of penetration.
CONTENTS
Preface
To the Instructor
CHAPTER
1 Ordinary Differential Equations
2 Infinite Series
3 Functions of Several Variables
4 Algebra and Geometry of Vectors. Matrices
5 Vector Field Theory
6 Partial Differential Equations
7 Complex Variable
8 Probability
9 Numerical Analysis
APPENDIX
A Determinants
B The Laplace Transform
C Comparison of the Riemann and Lebesgue Integrals
D Table of $\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_0^z e^{-t^2/2}\,dt$
Answers
Index
CHAPTER 1
ORDINARY DIFFERENTIAL EQUATIONS
Preliminary Remarks and Orientation
1. Definition of Terms and Generalities
2. The Slipping of a Belt on a Pulley
3. Growth
4. Diffusion and Chemical Combination
5. The Elastic Curve
The Solution of First-order Equations
6. Equations with Separable Variables
7. Homogeneous Differential Equations
8. Exact Differential Equations
9. Integrating Factors
10. The First-order Linear Equation
11. Equations Solvable for y or y'
12. The Method of Substitution
13. Reduction of Order
Geometry and the First-order Equation
14. Orthogonal Trajectories
15. Parabolic Mirror. Pursuit Curves
16. Singular Solutions
17. The General Behavior of Solutions
Applications of First-order Equations
18. The Hanging Chain
19. Newton's Law of Motion
20. Newton's Law of Gravitation
Linear Differential Equations
21. Linear Homogeneous Second-order Equations
22. Homogeneous Second-order Linear Equations with Constant Coefficients
23. Differential Operators
24. Nonhomogeneous Second-order Linear Equations
25. The Use of Complex Forms of Solutions in Evaluating Particular Integrals
26. Linear nth-order Equations with Constant Coefficients
27. General Linear Differential Equations of nth Order
28. Variation of Parameters
29. Reduction of the Order of Linear Equations
30. The Euler-Cauchy Equation
Applications of Linear Equations
31. Free Vibrations of Electrical and Mechanical Systems
32. Viscous Damping
33. Forced Vibrations. Resonance
34. The Euler Column. Rotating Shaft
Systems of Equations
35. Reduction of Systems to a Single Equation
36. Systems of Linear Equations with Constant Coefficients
The power and effectiveness of mathematical methods in the study of
natural sciences stem, to a large extent, from the unambiguous language
of mathematics, with the aid of which the laws governing natural phe-
nomena can be formulated. Many natural laws, especially those con-
cerned with rates of change, can be phrased as equations involving deriva-
tives or differentials. For example, when a verbal statement of Newton’s
second law of motion is translated into mathematical symbols, there
results an equation relating time derivatives of displacements to forces.
A study of such equations then provides a complete qualitative and
quantitative characterization of the behavior of mechanical systems under
the action of forces. Several broad types of equations studied in this
book characterize physical situations of great diversity and practical
interest.
The first half of this chapter is concerned with preliminaries and special
techniques devised for the solution of the first-order equations arising
commonly in applications. The second half contains a comprehensive
treatment of linear differential equations with constant coefficients and
an introduction to linear equations with variable coefficients. Linear
equations occupy a prominent place in the study of the response of elastic
structures to impressed forces and in the analysis of electrical circuits and
servomechanisms. They also appear in numerous boundary-value problems
in the theory of diffusion and heat flow, in quantum mechanics and fluid
mechanics, and in electromagnetic theory.
PRELIMINARY REMARKS AND ORIENTATION
1. Definition of Terms and Generalities. Any function containing var-
iables and their derivatives (or differentials) is called a differential expres-
sion, and every equation involving differential expressions is called a
differential equation. Differential equations are divided into two classes,
ordinary and partial. The former contain only one independent variable
and derivatives with respect to it. The latter contain more than one
independent variable.
The order of the highest derivative contained in a differential equation
is called the order of the differential equation. Thus
$$\frac{d^2 y}{dx^2} + 5y^2 = 0$$
is an ordinary differential equation of order 2, and
$$\frac{\partial^3 y}{\partial x^2\,\partial t} + yxt = 0$$
is a partial differential equation of order 3.
A function $y = \varphi(x)$ is said to be a solution of the differential equation
$$F(x, y, y') = 0 \tag{1-1}$$
if, on the substitution of $y = \varphi(x)$ and $y' = \varphi'(x)$ in the left-hand member
of (1-1), the latter vanishes identically.¹ Again, $y = \varphi(x)$ is a solution
of the second-order equation $F(x, y, y', y'') = 0$ when the substitution
$y = \varphi(x)$, $y' = \varphi'(x)$, $y'' = \varphi''(x)$ reduces this to an identity in $x$.
Similarly for equations of order $n$.
For example, the first-order differential equation
$$y' + 2xy - e^{-x^2} = 0 \tag{1-2}$$
has a solution $y = xe^{-x^2}$, because the substitution of $y = xe^{-x^2}$ and
$y' = e^{-x^2} - 2x^2 e^{-x^2}$ in (1-2) reduces it to an identity $0 \equiv 0$. Also, the equation
$$y'' + y = 0$$
has a solution $y = \sin x$, as can be easily verified by substitution.
We begin our study of differential equations with the first-order equation
(1-1), which we suppose can be solved for $y'$ to yield the equation
$$y' = f(x, y). \tag{1-3}$$
For reasons which will become clear presently, we shall always assume
that $f(x,y)$ is a continuous function throughout some region in the $xy$
plane, and we shall study the solutions of (1-3) [or, equivalently, of (1-1)]
in that region.
The geometrical meaning of the term solution of (1-3) is suggested at
once by the interpretation of the derivative $y'$ as the slope of the tangent
line to some curve $y = \varphi(x)$, for if $(x,y)$ is a point on the curve $y = \varphi(x)$,
and if at every point of this curve the slope is equal to $f(x,y)$, then $\varphi(x)$ is
a solution of (1-3).
¹ Here, as elsewhere in this book, primes are used to denote differentiation: $y' \equiv dy/dx$,
$y'' \equiv d^2y/dx^2$, ..., $y^{(n)} \equiv d^ny/dx^n$.
One can get an idea of the shape of the curve $y = \varphi(x)$ in the following
way: Let us choose a point $(x_0,y_0)$ and compute
$$y' = f(x_0,y_0). \tag{1-4}$$
The number $f(x_0,y_0)$ determines a direction of the curve at $(x_0,y_0)$. Now,
let $(x_1,y_1)$ be a point near $(x_0,y_0)$ in the direction specified by (1-4). Then
$y' = f(x_1,y_1)$ determines a new direction at $(x_1,y_1)$ (Fig. 1). Upon proceeding
a short distance in this new direction, we select a new point $(x_2,y_2)$ and at
this point determine a new slope $y' = f(x_2,y_2)$. As this process is continued,
a curve is built up consisting of short line segments. If the successive points
$(x_0,y_0), (x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)$ are chosen near one another, the series
of straight-line segments approximates a smooth curve $y = \varphi(x)$ which is a
solution of (1-3) associated with the choice of the initial point $(x_0,y_0)$. A
different choice of the initial point will, in general, give a different curve,
so that the solutions of Eq. (1-3) can be viewed as being given by a whole
family of curves. Such curves are called integral curves, and each curve in
the family represents a particular solution or an integral of our equation.
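The construction by short line segments can be tried numerically. The following is only a minimal sketch in Python, not part of the original text: the function name `euler_polygon`, the step sizes, and the choice of Eq. (1-2) as the test equation are assumptions made for illustration. Each step moves a short distance h in x along the slope f(x, y) at the current point, and the end point is compared with the known solution $y = xe^{-x^2}$ through (0, 0).

```python
import math

def euler_polygon(f, x0, y0, h, n):
    """Return the vertices (x_k, y_k) of the polygonal approximation."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n):
        y = y + h * f(x, y)   # follow the direction f(x, y) for a step h in x
        x = x + h
        points.append((x, y))
    return points

def f(x, y):
    # Eq. (1-2) solved for y': y' = exp(-x^2) - 2xy
    return math.exp(-x * x) - 2.0 * x * y

def exact(x):
    # Known solution of Eq. (1-2) through the initial point (0, 0)
    return x * math.exp(-x * x)

# Two polygons from (0, 0) to x = 1 (hypothetical step sizes)
coarse = euler_polygon(f, 0.0, 0.0, 0.1, 10)
fine = euler_polygon(f, 0.0, 0.0, 0.001, 1000)

err_coarse = abs(coarse[-1][1] - exact(1.0))
err_fine = abs(fine[-1][1] - exact(1.0))
print(err_coarse, err_fine)
```

Choosing the points nearer one another (smaller h) should bring the polygon closer to the integral curve, which is what the comparison of the two errors is meant to suggest.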
Also, we can make a surmise that, unless f(x,y) in the right-hand member
of (1-3) is a badly behaving function, for each choice of the initial point
there will be just one solution of Eq. (1-3). This surmise is capable of
proof, which we do not give here because it requires the use of analytical
tools which are not provided in the usual calculus courses. However, the
statement of essential facts is easy to grasp, and since it will facilitate the
understanding of subsequent developments, we give it here as a basic
theorem.
Existence and Uniqueness Theorem. The equation $y' = f(x,y)$ has
one and only one integral curve passing through each point of the region in
which both $f(x,y)$ and $\partial f/\partial y$ are continuous functions.¹
Unless a statement to the contrary is made, we shall suppose that the
restrictions imposed on $f(x,y)$ in this theorem are fulfilled, so that Eq.
(1-3) has a unique solution for each choice of $(x_0,y_0)$ in the appropriate
region of the $xy$ plane.
¹ It suffices to suppose that $|\partial f/\partial y|$ is bounded in the region. Proofs of this theorem
are contained in many books on differential equations, for example, E. L. Ince,
“Ordinary Differential Equations,” p. 62. See also Sec. 17 of this chapter.
Since by changing the initial value $y|_{x=x_0} = y(x_0)$ we get a family of
curves depending on the arbitrarily chosen value $y(x_0)$, the equation of
this family can be written in the form
$$y = \varphi(x,c) \tag{1-5}$$
involving one arbitrary constant $c$, corresponding to the arbitrary choices
of $y(x_0)$. A particular curve of the family (1-5) passing through $(x_0,y_0)$
is then determined by the value of $c$ such that $y_0 = \varphi(x_0,c)$.
A solution of the first-order equation (1-3) involving one arbitrary
constant is called a general solution.¹ Such solutions are often written in
the implicit form
$$\Phi(x,y,c) = 0, \tag{1-6}$$
where it is understood that (1-6) can be solved for $y$ to yield the explicit
form (1-5). In practice it may not be necessary to exhibit the explicit
form. The essential feature of the general solution [be it given by (1-5)
or (1-6)] is that the constant $c$ in it can be determined so that an integral
curve passes through a given point $(x_0,y_0)$ of the region under consideration.
We illustrate this by demonstrating that throughout the $xy$ plane the
general solution of Eq. (1-2) can be written as
$$y = e^{-x^2}(x + c). \tag{1-7}$$
The fact that (1-7) is, indeed, a solution is easily verified by substituting
(1-7) in (1-2). Moreover, it is a general solution, because on setting
$x = x_0$ and $y = y_0$ we get
$$y_0 = e^{-x_0^2}(x_0 + c). \tag{1-8}$$
Thus the integral curve passing through $(x_0,y_0)$ corresponds to
$$c = y_0 e^{x_0^2} - x_0.$$
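The determination of c from a given point can be checked numerically. The sketch below is an illustration only: the function names and the initial point (1.5, -0.4) are hypothetical. It computes c from Eq. (1-8), confirms that the curve (1-7) passes through the chosen point, and estimates the left-hand member of Eq. (1-2) with a central difference in place of y'.

```python
import math

def general_solution(x, c):
    # Eq. (1-7): y = exp(-x^2) * (x + c)
    return math.exp(-x * x) * (x + c)

def constant_through(x0, y0):
    # Eq. (1-8) solved for c: c = y0 * exp(x0^2) - x0
    return y0 * math.exp(x0 * x0) - x0

def residual(x, c, h=1e-5):
    # Left-hand member of Eq. (1-2), y' + 2xy - exp(-x^2),
    # with y' estimated by a central difference of width 2h
    dy = (general_solution(x + h, c) - general_solution(x - h, c)) / (2 * h)
    return dy + 2 * x * general_solution(x, c) - math.exp(-x * x)

x0, y0 = 1.5, -0.4                     # hypothetical initial point
c = constant_through(x0, y0)
print(c, general_solution(x0, c), residual(0.7, c))
```

The residual is not exactly zero because the derivative is only approximated, but it should be small compared with the individual terms.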
As another example consider the equation
$$\frac{dy}{dx} = f(x), \tag{1-9}$$
where $f(x)$ is any continuous function. A general solution of this equation,
obtained by direct integration, is
$$y = \int f(x)\,dx + c. \tag{1-10}$$
¹ Some first-order equations may have solutions which cannot be determined from
the general solution for any value of $c$. Such solutions, called singular solutions, arise
only when the conditions imposed on $f(x,y)$ in the basic theorem are not fulfilled.
We show next that (1-10) is a general solution of (1-9). We denote an
indefinite integral in (1-10) by $F(x)$, so that $dF/dx = f(x)$. Then (1-10)
is the same as
$$y = F(x) + c. \tag{1-11}$$
On setting $x = x_0$, $y = y_0$, we get
$$y_0 = F(x_0) + c,$$
so that
$$c = y_0 - F(x_0),$$
and we can, therefore, write (1-11) as
$$y = F(x) - F(x_0) + y_0 = F(x)\Big|_{x_0}^{x} + y_0. \tag{1-12}$$
But from the fundamental theorem of integral calculus,
$$\int_{x_0}^{x} f(x)\,dx = F(x)\Big|_{x_0}^{x},$$
and therefore (1-12) yields the desired particular solution
$$y = \int_{x_0}^{x} f(x)\,dx + y_0, \tag{1-13}$$
corresponding to the choice of the initial point $(x_0,y_0)$.
Formula (1-13) illustrates the procedure of deducing particular solutions
by integrating the given equation (1-9) between limits. It is frequently
simpler than the procedure of determining the desired solution by calculat-
ing the constant c in the general solution from the initial data.
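The two procedures can be compared on a concrete case. In the sketch below, the choice f(x) = cos x with F(x) = sin x, the initial point (1, 2), and the trapezoidal rule are all assumptions made for illustration. The first function integrates between limits as in (1-13); the second determines c from the initial data as in (1-11).

```python
import math

def integrate(f, a, b, n=1000):
    """Composite trapezoidal rule for the integral of f from a to b."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

f = math.cos          # hypothetical right-hand member of Eq. (1-9)
F = math.sin          # an indefinite integral of f
x0, y0 = 1.0, 2.0     # hypothetical initial point

def y_between_limits(x):
    # Eq. (1-13): integrate from x0 to x and add y0
    return integrate(f, x0, x) + y0

c = y0 - F(x0)        # constant determined from the initial data

def y_from_constant(x):
    # Eq. (1-11) with c chosen so that the curve passes through (x0, y0)
    return F(x) + c

difference = abs(y_between_limits(2.5) - y_from_constant(2.5))
print(difference)
```

The small remaining difference comes from the numerical quadrature, not from the two formulas, which agree exactly.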
The foregoing discussion can be extended to equations of higher order.
Thus, the nth-order equation
$$F(x, y, y', \ldots, y^{(n)}) = 0, \tag{1-14}$$
which we shall write in the form solved for $y^{(n)}$ as
$$y^{(n)} = f(x, y, y', \ldots, y^{(n-1)}), \tag{1-15}$$
has a unique solution for $n$ arbitrarily assigned initial values,
$$y(x_0),\; y'(x_0),\; \ldots,\; y^{(n-1)}(x_0), \tag{1-16}$$
whenever the function $f$ in (1-15) is continuous together with the partial
derivatives $\partial f/\partial y, \partial f/\partial y', \ldots, \partial f/\partial y^{(n-1)}$.
When the values in (1-16) are varied, we get a family of curves, the so-called
n-parameter family, corresponding to $n$ independent choices of
constants in (1-16). The equation of this family of solutions can be written
in the form
$$y = \varphi(x, c_1, c_2, \ldots, c_n) \tag{1-17}$$
involving $n$ arbitrary constants $c_i$. A solution such as (1-17) is called a
general solution of the nth-order equation (1-15) [or (1-14)], provided that
the constants $c_i$ in (1-17) can be determined for every given set of arbitrarily
assigned initial values (1-16). The general solution (1-17) may also appear
in an implicit form as
$$\Phi(x, y, c_1, c_2, \ldots, c_n) = 0, \tag{1-18}$$
which on solving for $y$ should give (1-17).
The meaning of the initial conditions (1-16), as they bear on the uniqueness
of solution of the second-order equation $F(x, y, y', y'') = 0$, is that the
integral curve of this equation is determined at $x = x_0$ if the ordinate
$y_0 = y(x_0)$ and the slope $y'(x_0)$ are specified.
To determine uniquely the solution of the third-order equation, we must
specify the value of the ordinate $y_0$, the slope $y_0'$, and the value of the
second derivative $y_0''$ at $x = x_0$.
In the following nine sections we shall deal with first-order equations,
which we can write in the differential notation as
P(x,y) dx + Q(x,y) dy = 0. (1-19)
If $Q(x,y) \neq 0$, Eq. (1-19) gives
$$\frac{dy}{dx} = -\frac{P(x,y)}{Q(x,y)},$$
which is in the form (1-3) with $f(x,y) = -P(x,y)/Q(x,y)$.
PROBLEMS
Classify the following differential equations as ordinary or partial, and determine
their orders:
1 .
dx 4 \dx)
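2. $\dfrac{\partial^4 z}{\partial x^4} + 2\dfrac{\partial^4 z}{\partial x^2\,\partial y^2} + \dfrac{\partial^4 z}{\partial y^4} = 0$;
3. $y' + \sin y + x = 0$;
4. $dy = \sqrt{1 - y^2}\,dx$;
5. $y'' + x^2 y' + xy = \sin x$;
7. $\sqrt{y} + \sqrt{x} = y'$;
8. $(y')^4 + y' = y'''$.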
Verify that the given expression is a solution of the given differential equation:
9. $y = ce^x$, $y' = y$;
10. $2e^y = e^x + ce^{-x}$, $y' = e^{x-y} - 1$;
11. $y = c_1 \sin x + c_2 \cos x$, $y'' + y = 0$;
12. $y = c_1 \sinh x + c_2 \cosh x$, $y'' - y = 0$;
13. $xy = \int f(x)\,dx$, $xy' + y = f(x)$.
14. Integrate $y' = 2x$ to show that its general solution is a family of parabolas $y = x^2 + c$.
Determine integral curves of this equation through (0,0), (1,1), (0,1), (1,-1).
15. Determine the integral curve for $y'' = 2x$ such that $y(0) = 0$ and $y'(0) = 1$. What
is the general solution of this equation?
2. The Slipping of a Belt on a Pulley. To illustrate the prominence of
differential equations in the study of various phenomena, this and the
following three sections are primarily concerned with the task of setting
up differential equations from physical principles.¹ Such solutions as
are included are intended merely as a preview of the systematic discussion
given in the subsequent sections. If he wishes, the reader may confine
his attention to the derivation of the equations only and return to the
question of solution after this systematic discussion has been assimilated.
The first example is given by the belt-pulley arrangement of Fig. 2,
which is now to be analyzed. Consider an element of the belt, of length
$\Delta s$, which has end points $P$ and $Q$ and subtends an angle $\Delta\theta$ at the
center $O$. Let $T$ be the tension at $P$ and $T + \Delta T$ at $Q$, and let $\Delta F$ be the
normal component of force on $\Delta s$ due to the pulley. Thus $\Delta F$ is the
component, along the radius $ON$, of the total resultant force exclusive of
$T$ and $T + \Delta T$.
Assume that the belt is stationary and that the pulley rotates, so that
there is slipping. Since the element $\Delta s$ is in static equilibrium, the
components of force along $ON$ must balance. This gives
$$(T + \Delta T)\sin\frac{\Delta\theta}{2} + T\sin\frac{\Delta\theta}{2} = \Delta F, \tag{2-1}$$
provided the weight of the belt is negligible or provided the pulley axis is
vertical. Equating forces at right angles to $ON$ leads to
$$(T + \Delta T)\cos\frac{\Delta\theta}{2} - T\cos\frac{\Delta\theta}{2} = \mu\,\Delta F, \tag{2-2}$$
where $\mu$ is the coefficient of sliding friction.² From (2-2) we may deduce
$$\Delta T \sim \mu\,\Delta F, \qquad \Delta\theta \to 0, \tag{2-3}$$
where the symbol $\sim$ (read “is asymptotic to”) means¹ that the ratio of
the quantities on each side tends to 1. Thus, if $a \sim b$, then $\lim a/b = 1$.
Equations (2-1) and (2-3) together show that $\Delta T \to 0$ as $\Delta\theta \to 0$. Since
$\sin(\Delta\theta/2) \sim \Delta\theta/2$, Eq. (2-1) now gives
$$T\,\Delta\theta \sim \Delta F. \tag{2-4}$$
Dividing (2-3) by (2-4) leads to $\Delta T/(T\,\Delta\theta) \sim \mu$, which becomes
$$\frac{dT}{T\,d\theta} = \mu \tag{2-5}$$
since $\lim\,(\Delta T/\Delta\theta) = dT/d\theta$.
Separating the variables in (2-5) yields $dT/T = \mu\,d\theta$, which, upon
integration, becomes $\log T = \mu\theta + c$. The initial condition $T = T_0$ when
$\theta = 0$ gives $c = \log T_0$, so that, taking exponentials,
$$T = T_0 e^{\mu\theta}. \tag{2-6}$$
¹ Further problems of the sort are treated in Secs. 18 to 20.
² We define $\mu$ by (2-2) and regard it as an experimental fact that $\mu$ approaches the
coefficient of friction for flat surfaces or, at any rate, some limit independent of $\theta$ as
$\Delta\theta \to 0$.
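The passage from the increments in (2-3) and (2-4) to the exponential law (2-6) can be imitated numerically. The sketch below is only an illustration: the values of T0 and mu, the wrap angle, and the function names are assumptions. It adds up the increments mu*T*d_theta over many small elements of the belt and compares the total with the closed form (2-6).

```python
import math

def belt_tension(T0, mu, theta):
    # Eq. (2-6): T = T0 * exp(mu * theta)
    return T0 * math.exp(mu * theta)

def belt_tension_numeric(T0, mu, theta, n=100000):
    # Accumulate the increments Delta T ~ mu * T * Delta theta
    # over n small elements of the belt
    d_theta = theta / n
    T = T0
    for _ in range(n):
        T += mu * T * d_theta
    return T

T0, mu = 10.0, 0.3              # hypothetical tension and friction coefficient
theta = math.pi                 # belt wrapped over half a turn
closed = belt_tension(T0, mu, theta)
summed = belt_tension_numeric(T0, mu, theta)
relative_error = abs(summed - closed) / closed
print(closed, summed, relative_error)
```

Because the tension grows exponentially with the angle of wrap, each additional turn multiplies the ratio T/T0 by the same factor exp(2*pi*mu).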
PROBLEMS
1. Obtain Eq. (2-3) by equating torques about the point $O$.
2. If the pulley axis is horizontal, and if the belt weighs $w$ lb per ft, show that Eq.
(2-3) becomes $\mu\,\Delta F \sim \Delta T - w\,\Delta s \cos\theta$ and Eq. (2-4) becomes
$\Delta F \sim T\,\Delta\theta + w\,\Delta s \sin\theta$, where $\Delta F$ is the normal component of the
reaction of the pulley on $\Delta s$ and the line $OA$ in Fig. 2 is horizontal, with $P$ above it.
Deduce the differential equation $dT/d\theta - \mu T = wr(\mu \sin\theta + \cos\theta)$, where $r$ is the radius.
3. Show that the equation in Prob. 2 becomes
$d(Te^{-\mu\theta}) = wre^{-\mu\theta}(\mu \sin\theta + \cos\theta)\,d\theta$
when multiplied by $e^{-\mu\theta}$, and thus obtain the solution.
3. Growth. Equation (2-5), which was obtained for the tension in a
slipping belt, arises in many other connections. For example, radium
decomposes at a rate proportional to the amount present. If this amount
is $A$ at time $t$, the foregoing statement means
$$\frac{dA}{dt} = -kA, \qquad k > 0, \tag{3-1}$$
the negative sign being chosen because $A$ decreases as $t$ increases. A
similar equation is followed by the growth of populations in certain
circumstances. Thus, the rate of increase being nearly proportional to the
number $N$ present, one can write $dN/dt = kN$. Again, certain organisms
grow at a rate proportional to their size $S$ at a given time so that $dS/dt = kS$.
¹ The relation symbolized by $\sim$ has many of the properties of strict equality. For
example, if $a \sim b$ and $b \sim c$ then $a \sim c$. To see this, observe that $a/b \to 1$, since
$a \sim b$; and $b/c \to 1$, since $b \sim c$, and hence, by multiplication, $(a/b)(b/c) \to 1 \cdot 1$. Thus
$a/c \to 1$, which is to say, $a \sim c$. The reader may verify similarly that $a \sim b$ and
$c \sim d$ together imply $ac \sim bd$ and $a/c \sim b/d$. Finally, if $a \sim b$ and $b$ is constant, we
may write $\lim a = b$. These properties are freely used in the text.
Example 1. In a colony of bacteria each bacterium divides into two after a time
interval, on the average, of length $\tau$. If there are $n$ bacteria at time $t = 0$ and $m$ at
time $t = 1$, with $n$ large, find the approximate value of $\tau$.
The hypothesis implies that $dN/dt = kN$, approximately, with greater and greater
accuracy as the number of bacteria $N$ becomes large. Separating variables gives
$dN/N = k\,dt$. Now $t = 0$ corresponds to $N = n$, and $t = 1$ corresponds to $N = m$, by
hypothesis. Thus,
$$\int_n^m \frac{dN}{N} = \int_0^1 k\,dt, \tag{3-2}$$
and similarly
$$\int_n^{2n} \frac{dN}{N} = \int_0^{\tau} k\,dt, \tag{3-3}$$
since $N$ doubles in the interval $\tau$. Equation (3-2) gives $\log m - \log n = k$, and (3-3)
gives $\log 2 = k\tau$, so that
$$\tau = \frac{\log 2}{\log m - \log n}.$$
This problem illustrates the useful method of integration between limits for the
determination of constants. A justification of this procedure is implicit in Sec. 1, Eq. (1-13).
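As a quick numerical illustration of the result of Example 1, the sketch below evaluates the formula for the division interval. The function name and the counts n = 1000, m = 8000 are hypothetical values chosen so that the colony doubles three times in one unit of time.

```python
import math

def division_interval(n, m):
    # tau = log 2 / (log m - log n), from Eqs. (3-2) and (3-3)
    return math.log(2) / (math.log(m) - math.log(n))

n, m = 1000, 8000                 # hypothetical counts at t = 0 and t = 1
tau = division_interval(n, m)
k = math.log(m) - math.log(n)     # growth constant from Eq. (3-2)
print(tau, k)
```

With these counts the colony grows by a factor of 8 = 2^3 in unit time, so the interval between divisions should come out close to one-third of the time unit.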
Example 2. A radioactive substance A decomposes into a new substance B, which
in turn decomposes into a third substance C. Set up a differential equation for the
amount of B at time $t$.
The rate of increase of B is equal to the rate at which B is formed from A minus the
rate at which B decomposes. Thus, denoting the amounts by $A$ and $B$,
$$\frac{dB}{dt} = -\frac{dA}{dt} - k_1 B. \tag{3-4}$$
This equation has two unknowns, $A$ and $B$. By (3-1), however, $A = ce^{-kt}$, so that
(3-4) becomes
$$\frac{dB}{dt} = kce^{-kt} - k_1 B. \tag{3-5}$$
A method of solving (3-5) is given in Sec. 10.
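Pending the systematic method of Sec. 10, a candidate solution of (3-5) can at least be tested numerically. The sketch below is an illustration under stated assumptions: the constants k, k1, c are hypothetical with k different from k1, the initial condition B(0) = 0 is assumed (the text does not specify one), and the closed form used is the standard solution for that case. The check substitutes it into (3-5) with the derivative replaced by a central difference.

```python
import math

k, k1, c = 1.0, 0.5, 2.0    # hypothetical constants, with k != k1

def B(t):
    # Candidate solution of Eq. (3-5), assuming B(0) = 0:
    # B = k*c/(k1 - k) * (exp(-k*t) - exp(-k1*t))
    return k * c / (k1 - k) * (math.exp(-k * t) - math.exp(-k1 * t))

def residual(t, h=1e-5):
    # dB/dt - (k*c*exp(-k*t) - k1*B), with dB/dt from a central difference
    dB = (B(t + h) - B(t - h)) / (2 * h)
    return dB - (k * c * math.exp(-k * t) - k1 * B(t))

for t in (0.5, 1.0, 3.0):
    print(t, B(t), residual(t))
```

The amount of B starts at zero, rises while A is still plentiful, and then decays, which is the behavior one expects from an intermediate product.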
PROBLEMS
1. If 3 g of a radioactive substance is present at time $t = 1$ and 1 g at $t = 4$, how much
was present initially?
2. In Example 2 of the text set up the differential equation for the amount of substance
C present at time $t$.
3. By actual substitution, determine $\alpha$ and $\beta$ in such a way that $B = \alpha e^{\beta t}$ is a
solution of Eq. (3-5).
4. The rate of decomposition of a certain chemical substance is proportional to the
amount of the substance still unchanged. If the amount of the substance at the end of
$t$ hr is $x$ and $x_0$ is the initial amount, show that $x = x_0 e^{-kt}$, where $k$ is the constant of
proportionality. Find $k$ if $x$ changes from 1,000 to 500 g in 2 hr.
5. A torpedo moving in still water is retarded with a force proportional to the velocity.
Find the speed at the end of $t$ sec and the distance traveled in $t$ sec if the initial
speed is 30 mph.
6. The rate at which a body is cooling is proportional to the difference in the temperatures
of the body and the surrounding medium. It is known that the temperature of a
body fell from 120 to 70°C in 1 hr when it was placed in air at 20°C. How long will it
take the body to cool to 40°C? 30°C? 20°C?
7. The percentage of incident light absorbed in passing through a thin layer of material
is proportional to the thickness of the material. If 1 in. of material reduces the light to
half its intensity, how much additional material is needed to reduce the intensity to
one-eighth of its initial value? Obtain the answer by inspection, and check by solving
an appropriate differential equation.
4. Diffusion and Chemical Combination. Problems involving chemical
reactions and the formation of mixtures often lead to differential equations;
the discussion is similar to that of Sec. 3. For example, suppose that a
tank contains g gal of water and that brine containing w lb of salt per
gallon flows into the tank and out again at a constant rate r gpm, starting
at time $t = 0$. At the same time a piece of rock salt is dropped into the
tank, where it dissolves at a constant rate of q lb per min. The mixture
being kept uniform by stirring, it is required to find the amount of salt
present at any time t > 0.
This problem may be taken as the typical problem for many questions
involving chemical reactions, mixing, and going into solution. The dif-
ferential equation is obtained by writing down the equation of continuity
(increase equals income minus outgo) for the amount of salt. Call this
amount $x = x(t)$ at time $t$. In the time interval from $t$ to $t + \Delta t$ the
number of gallons entering the tank is $r\,\Delta t$, since the rate of flow is $r$.
Now each gallon contains $w$ lb of salt. Hence the $r\,\Delta t$ gal contains
$$wr\,\Delta t \tag{4-1}$$
pounds of salt, and this, then, represents income due to the inflowing brine.
The income due to the dissolving salt is
$$q\,\Delta t, \tag{4-2}$$
by the definition of $q$.
It remains to compute the amount of salt lost in the mixture leaving
the system. The number of gallons leaving is $r\,\Delta t$, the concentration of
the mixture in pounds per gallon is $x/g$ at time $t$, and hence the number
of pounds leaving is
$$\frac{\bar{x}}{g}\,r\,\Delta t. \tag{4-3}$$
Here $\bar{x}$ denotes the mean value of $x$ over the interval $(t, t + \Delta t)$. We
assume $x$ to be continuous, so that
$$\lim_{\Delta t \to 0} \bar{x} = x. \tag{4-4}$$
From (4-1), (4-2), and (4-3) we obtain
$$\Delta x = wr\,\Delta t + q\,\Delta t - \frac{\bar{x}}{g}\,r\,\Delta t,$$
which gives
$$\frac{dx}{dt} = wr + q - \frac{rx}{g} \tag{4-5}$$
when we divide by $\Delta t$ and let $\Delta t \to 0$, using (4-4).
Example: Find the concentration of salt at the end of 4 min when $w = 1$, $g = 2$,
$q = 3$, $r = 4$.
The differential equation is $dx/dt = 7 - 2x$ or $dx/(7 - 2x) = dt$. Multiplying by $-2$
and integrating give $\log(7 - 2x) = -2t + c$. Since $x = 0$ when $t = 0$, it is necessary
that $c = \log 7$, so that $-2t = \log(7 - 2x) - \log 7 = \log(1 - 2x/7)$ or, taking
exponentials, $1 - 2x/7 = e^{-2t}$. This gives the amount of dissolved salt $x$ at the end of $t$ min.
Putting $t = 4$, solving for $x$, and noting that the concentration is not $x$ but $x/g$ give
$\tfrac{7}{4}(1 - e^{-8})$ as the final answer.
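The worked example can be cross-checked numerically. The sketch below is only an illustration, not part of the text's method: the function names `salt_exact` and `salt_rk4` are hypothetical, and a classical fourth-order Runge-Kutta step is assumed as the integrator for Eq. (4-5).

```python
import math

def salt_exact(t, w=1.0, g=2.0, q=3.0, r=4.0):
    """Closed-form solution of dx/dt = w*r + q - r*x/g with x(0) = 0."""
    a = w * r + q          # total income rate of salt, lb per min
    b = r / g              # fraction of the tank replaced per minute
    return (a / b) * (1.0 - math.exp(-b * t))

def salt_rk4(t_end, w=1.0, g=2.0, q=3.0, r=4.0, steps=4000):
    """Integrate Eq. (4-5) from t = 0 with classical Runge-Kutta steps."""
    def rate(x):
        return w * r + q - r * x / g
    h = t_end / steps
    x = 0.0
    for _ in range(steps):
        k1 = rate(x)
        k2 = rate(x + 0.5 * h * k1)
        k3 = rate(x + 0.5 * h * k2)
        k4 = rate(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

# Concentration after 4 min is the amount divided by the tank volume g = 2.
concentration = salt_exact(4.0) / 2.0
```

With the data of the example, `concentration` should agree with $\tfrac{7}{4}(1 - e^{-8})$, and `salt_rk4(4.0)` should agree with `salt_exact(4.0)` to within the integration error.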
PROBLEMS
1. Solve the example of the text by the method of integration between limits. (See
Example 1, Sec. 3. Here $x = 0$ at $t = 0$, $x = x$ at $t = 4$.)
2. How would the discussion in Sec. 4 change if the rock salt had been added at time
$t = t_0$ instead of time $t = 0$?
3. How would the discussion in Sec. 4 change if the rock salt dissolved at a rate
proportional to the amount undissolved, rather than at the constant rate $q$? Hint: If $A$ is
this amount, $dA/dt = -kA$. From this find $A$ at time $t$, and from that find $q = -dA/dt$.
4. Let $A$ be the amount of a substance at the beginning of a chemical reaction, and let
$x$ be the amount of the substance entered in the reaction after $t$ sec. Assuming that the
rate of change of the substance is proportional to the amount remaining, deduce that
$dx/dt = c(A - x)$, where $c$ is a constant depending on the reaction. Show that
$x = A(1 - e^{-ct})$.
5. Let a solution contain two substances whose amounts expressed in gram molecules,
at the beginning of a reaction, are $A$ and $B$. If an equal amount $x$ of both substances
has changed at the time $t$, and if the rate of change is jointly proportional to the amounts
of the substances remaining, obtain the equation $dx/dt = k(A - x)(B - x)$. Solve,
assuming that $x = 0$ when $t = 0$.
6. Formulate the appropriate differential equation if the rate at which a substance
dissolves is jointly proportional to the amount present and to the difference between the
actual concentration and the saturate concentration.
5. The Elastic Curve. Consider a horizontal elastic beam under the
action of vertical loads. It is assumed that all the forces acting on the
beam lie in a plane containing the central axis of the beam. Choose the
x axis along the central axis of the beam in undeformed state and the
positive y axis down (Fig. 3). Under the action of external forces $F_i$ the beam
is bent and its central axis deformed. The deformed central axis, shown
in the figure by the dashed line, is known as the elastic curve, and it is an
important problem in the theory of elasticity to determine its shape.
[Fig. 3]
A beam made of elastic material that obeys Hooke's law is known to deform
in such a way that the curvature $K$ of the elastic curve is
proportional to the bending moment $M$. In fact,
$$K = \frac{y''}{[1 + (y')^2]^{3/2}} = \frac{M}{EI}, \tag{5-1}$$
where $E$ is Young's modulus, $I$ is the moment of
inertia of the cross section of the beam about a
horizontal line passing through the centroid of the section and lying in
the plane of the cross section, and $y$ is the ordinate of the elastic curve.
The important relation (5-1) bears the name Bernoulli-Euler law. When the
deflection of the beam is small, the slope of the elastic curve is also generally
small and one can neglect the term $(y')^2$ in (5-1) to obtain an approximate
equation
$$y'' = \frac{M}{EI}. \tag{5-2}$$
The bending moment M in any cross section of the beam is equal to
the algebraic sum of the moments of all the forces $F_i$ acting on one side
of the section. The moments of the forces $F_i$ are taken about a horizontal
line lying in the cross section in question.
Example: Consider a cantilever beam of length $l$, built in at the end $x = 0$ and carrying,
in addition to a distributed load $w(x)$ lb per ft, a concentrated load $W$ lb and a couple
$L$ ft-lb applied at the end $x = l$ (Fig. 4).
The resultant moment in a cross section $x$ ft from the end $x = 0$, produced by the
loads acting to the right of that section, is
$$M(x) = \int_x^l (\xi - x)\,w(\xi)\,d\xi + W(l - x) + L. \tag{5-3}$$
[Fig. 4]
If $w(x) \equiv 0$ and $L = 0$, this formula yields $M = W(l - x)$, and hence, from (5-2), the
differential equation of the central line of a cantilever beam subjected to the end load $W$ is
$$y'' = \frac{W}{EI}(l - x).$$
On integrating this equation we get
$$y = \frac{W}{EI}\left(\frac{l x^2}{2} - \frac{x^3}{6}\right) + c_1 x + c_2.$$
The integration constants $c_1$ and $c_2$ can be evaluated from the conditions $y(0) = 0$,
$y'(0) = 0$, stating that the displacement and the slope of the central line vanish at the
built-in end. It is readily checked that these conditions lead to
$$y = \frac{W x^2}{2EI}\left(l - \frac{x}{3}\right),$$
so that the displacement $d$ at the free end is $d = W l^3/(3EI)$.
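As a check on the cantilever result, the following sketch evaluates the elastic curve and confirms by central differences that it satisfies $y'' = W(l - x)/EI$. The function names and the numerical values of $W$, $l$, and $EI$ are illustrative assumptions, not data from the text.

```python
def cantilever_deflection(x, W, l, EI):
    """Elastic curve y = W x^2 (l - x/3) / (2 EI) for an end-loaded cantilever."""
    return W * x**2 * (3.0 * l - x) / (6.0 * EI)

def second_derivative(f, x, h=1e-2):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

# Illustrative (assumed) data: load in lb, length in ft, stiffness in lb-ft^2.
W, l, EI = 100.0, 10.0, 1.0e6

def curve(x):
    return cantilever_deflection(x, W, l, EI)

tip = curve(l)                           # displacement d at the free end
moment_check = second_derivative(curve, 4.0)
```

Here `tip` should equal $W l^3/(3EI)$ and `moment_check` should reproduce $W(l - 4)/EI$, because the curve is a cubic and the central difference is exact for cubics apart from rounding.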
PROBLEMS
1. A beam of length $l$ is freely supported at its ends and is loaded in the center by a
concentrated vertical load $W$, which is large in comparison with the weight of the beam
(see Fig. 5). By symmetry, the behavior of this beam is the same as that of a cantilever
beam of length $l/2$ loaded by a concentrated load of magnitude $W/2$ at its free end. Verify
this equivalence by direct computation of the elastic curve. Hint:
$$M = -\frac{W}{2}\,x, \quad 0 < x < \frac{l}{2}; \qquad M = -\frac{W}{2}(l - x), \quad \frac{l}{2} < x < l.$$
[Fig. 5]
2. A uniform unloaded beam of length $l$ weighs $w$ lb per ft. Find the maximum
deflection when it is used as a cantilever beam and also when it is freely supported at each
end. Hint: Since the reaction at the end $x = l$ is $R = wl/2$, the moment in the cross
section at a distance $x$ from the end $x = 0$ is
$$M = w\int_x^l (\xi - x)\,d\xi - \frac{wl}{2}(l - x).$$
THE SOLUTION OF FIRST-ORDER EQUATIONS
6. Equations with Separable Variables. Generally speaking, the problem
of solving differential equations is a very difficult one. Even such a
simple equation as $y' = f(x,y)$ cannot be solved in general; that is, no
formulas are available for solving the general differential equation of the
first order. It is possible, however, to classify some of the first-order
differential equations according to several types and to indicate special
methods of solution suitable for each of these.
Prominent among these types are the equations with separable variables,
that is, equations which can be put in the form
$$P(x)\,dx + Q(y)\,dy = 0,$$
where $P(x)$ is a function of $x$ only and $Q(y)$ is a function of $y$ only. This
type of equation has already been encountered in the special examples
solved above. Its general solution is
$$\int P(x)\,dx + \int Q(y)\,dy = c,$$
where $c$ is an arbitrary constant. In order to obtain an explicit solution
all that is necessary is to perform the indicated integrations.
Example: Find a solution of $y' + e^x y = e^x y^2$ which goes through $(0, \tfrac12)$.
The equation can be written as $y' + e^x(y - y^2) = 0$ or
$$\frac{dy}{y(1 - y)} + e^x\,dx = 0.$$
Integration gives
$$\log\frac{y}{1 - y} + e^x = c,$$
which is a general solution. Putting $x = 0$, $y = \tfrac12$ gives $c = \log 1 + e^0 = 1$, so that the
required particular solution is
$$\log\frac{y}{1 - y} + e^x = 1.$$
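The particular solution can be solved explicitly for $y$, namely $y = 1/(1 + e^{e^x - 1})$, which follows by exponentiating the relation above. The sketch below checks this explicit form against the differential equation with a central difference; the names `y_particular` and `residual` are hypothetical.

```python
import math

def y_particular(x):
    """Explicit form of log(y/(1-y)) + e^x = 1, solved for y."""
    return 1.0 / (1.0 + math.exp(math.exp(x) - 1.0))

def residual(x, h=1e-5):
    """Left side minus right side of y' + e^x y = e^x y^2 at the point x."""
    dy = (y_particular(x + h) - y_particular(x - h)) / (2.0 * h)
    y = y_particular(x)
    return dy + math.exp(x) * y - math.exp(x) * y * y

residuals = [residual(x) for x in (-1.0, 0.0, 0.5, 1.0)]
```

The residuals should vanish up to the finite-difference error, and `y_particular(0.0)` should return the prescribed value $\tfrac12$.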
PROBLEMS
Solve the following differential equations. In Probs. 4 to 6 find a solution through the
point (0,1).
1. Vl — x 1 dy * VY — y 2 dx. 2. y ' = xy 2 — x.
« , sin2 x A ■ 2 j 2 ^
3. y « — 4. sin x cos^ y dx « cos^ x dy.
sin y
5. VTf x dy « (1 -f I/ 2 ) dx. 6 . y' » •
1 +z
7. Homogeneous Differential Equations. A function $f(x,y)$ of the two
variables $x$ and $y$ is said to be homogeneous of degree $n$ provided that
$$f(\lambda x, \lambda y) \equiv \lambda^n f(x,y), \qquad \lambda > 0.$$
Thus, f(z,y) = x 3 4* + y 3 is a homogeneous function of degree 3,
and $f(x,y) = x^2 \sin(x/y) + xy$ is a homogeneous function of degree 2, as
follows at once on replacing $x$ by $\lambda x$ and $y$ by $\lambda y$.
If the differential equation is of the form
$$P(x,y)\,dx + Q(x,y)\,dy = 0, \tag{7-1}$$
where $P(x,y)$ and $Q(x,y)$ are homogeneous functions of the same degree,
then (7-1) can be written in the form
$$y' = -\frac{P(x,y)}{Q(x,y)} \equiv \phi(x,y), \tag{7-2}$$
where $\phi(x,y)$ is a homogeneous function of degree zero; that is,
$$\phi(\lambda x, \lambda y) \equiv \lambda^0 \phi(x,y) \equiv \phi(x,y).$$
If $\lambda$ is set equal to $1/x$, then
$$\phi(x,y) \equiv \phi(\lambda x, \lambda y) \equiv \phi\!\left(1, \frac{y}{x}\right),$$
which shows that a homogeneous function of degree zero can always be
expressed as a function of $y/x$. This suggests making the substitution
$y/x = v$. Then, since $y = vx$,
$$\frac{dy}{dx} = \frac{dv}{dx}\,x + v.$$
Substituting this value of $dy/dx$ in (7-2) gives
$$x\frac{dv}{dx} + v = \phi(1, v).$$
This equation is of the type considered in Sec. 6. Separating the variables
leads to
$$\frac{dv}{\phi(1,v) - v} = \frac{dx}{x},$$
which can be integrated at once.
Example: Solve
$$y^2 + x^2\frac{dy}{dx} = xy\frac{dy}{dx}.$$
This equation can be put in the form
$$\frac{dy}{dx} = \frac{y^2}{xy - x^2} = \frac{(y/x)^2}{y/x - 1}.$$
Letting $y/x = v$ and computing $dy/dx$ from $y = vx$ give
$$v + x\frac{dv}{dx} = \frac{v^2}{v - 1} \qquad\text{or}\qquad x\frac{dv}{dx} = \frac{v}{v - 1}.$$
Separation of the variables leads to
$$\frac{dx}{x} + \frac{1 - v}{v}\,dv = 0,$$
and integration yields $\log x + \log v - v = c$ or $\log vx - v = c$. Since $v = y/x$, the final
answer is $\log y - y/x = c$.
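One way to test the final answer is to integrate $dy/dx = y^2/(xy - x^2)$ numerically and confirm that $\log y - y/x$ stays constant along the computed curve. The sketch below assumes a classical Runge-Kutta integrator and an arbitrary starting point $(1, 2)$; both choices, and the function names, are illustrative only.

```python
import math

def slope(x, y):
    """Right-hand side dy/dx = y^2 / (x*y - x^2) of the example."""
    return y * y / (x * y - x * x)

def rk4(x, y, x_end, steps=2000):
    """Advance y from x to x_end with classical Runge-Kutta steps."""
    h = (x_end - x) / steps
    for _ in range(steps):
        k1 = slope(x, y)
        k2 = slope(x + 0.5 * h, y + 0.5 * h * k1)
        k3 = slope(x + 0.5 * h, y + 0.5 * h * k2)
        k4 = slope(x + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return y

def invariant(x, y):
    """The quantity log y - y/x that the general solution holds constant."""
    return math.log(y) - y / x

c_start = invariant(1.0, 2.0)
c_end = invariant(1.5, rk4(1.0, 2.0, 1.5))
```

The two values `c_start` and `c_end` should coincide to within the integration error.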
PROBLEMS
Solve the following differential equations. In Probs. 4 to 6 find a solution through
the point (1,1).
1 . ( x 2 -f y 2 ) dy 4- 2 xy dx * 0.
ydy y
3. x cos -—-=*=?/ cos x.
XOX X
5 * x 2 y dx » (x 3 — y 3 ) dy.
2. xy' - y Vx* — y 2
4. (x -f y)y' ** x -y.
- dy xy - y 2
Some of the following equations are separable; some are homogeneous. Solve them.
7. sinh x dy + cosh y dx =» 0.
9. x(V7y + y)dx » x 2 dy.
11. xy' ** y + xe v/a! .
dx x - Vxy
10. x 2 y' — y 2 ■» x 2 yy'.
12. y' =* y' log y + tan x see 2 x.
8. Exact Differential Equations. An expression $P(x,y)\,dx + Q(x,y)\,dy$
is said to be exact if it coincides with the differential
$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy$$
of some function $F(x,y)$, that is, if
$$P(x,y)\,dx + Q(x,y)\,dy = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy. \tag{8-1}$$
In these circumstances the equation
$$P(x,y)\,dx + Q(x,y)\,dy = 0 \tag{8-2}$$
is simply $dF = 0$, and its general solution, therefore, is
$$F(x,y) = c. \tag{8-3}$$
When a function $F(x,y)$ satisfying the relation (8-1) exists, we conclude
that
$$\frac{\partial F}{\partial x} = P(x,y), \qquad \frac{\partial F}{\partial y} = Q(x,y). \tag{8-4}$$
Moreover, if $\partial^2 F/(\partial x\,\partial y) = \partial^2 F/(\partial y\,\partial x)$, we obtain by differentiating (8-4)
a necessary condition,
$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}, \tag{8-5}$$
for the existence of $F(x,y)$. This condition also suffices to construct $F(x,y)$
in every rectangular region throughout which $P$, $\partial P/\partial y$, and $\partial Q/\partial x$ are
continuous.¹ Indeed, on integrating the first of Eqs. (8-4) with respect to
$x$, we get
$$F(x,y) = \int P(x,y)\,dx + f(y), \tag{8-6}$$
where $f(y)$ is an arbitrary differentiable function of $y$, since $y$ appearing
in the integral of (8-6) is treated as a constant. We next determine $f(y)$
so as to satisfy the second of Eqs. (8-4). Differentiating (8-6) with respect
to $y$ and equating the result to $Q(x,y)$ give
$$\frac{\partial F}{\partial y} = \frac{\partial}{\partial y}\int P(x,y)\,dx + f'(y) = Q(x,y),$$
so that
$$f'(y) = Q(x,y) - \frac{\partial}{\partial y}\int P(x,y)\,dx.$$
This determines
$$f(y) = \int\left[Q(x,y) - \frac{\partial}{\partial y}\int P(x,y)\,dx\right]dy, \tag{8-7}$$
provided the expression in the brackets in (8-7) is a function of $y$ only.
But that is always the case, since its derivative with respect to $x$ is
$\partial Q/\partial x - \partial P/\partial y$, and this vanishes whenever (8-5) holds. Accordingly, the
substitution of $f(y)$ from (8-7) in (8-6) gives the function $F(x,y)$ and thus the
desired solution $F(x,y) = c$.
Example: Solve the equation
$$(2xy + 1)\,dx + (x^2 + 4y)\,dy = 0.$$
This equation is exact, since $\partial P/\partial y = \partial Q/\partial x = 2x$. Thus there is a function $F(x,y)$ such
that
$$\frac{\partial F}{\partial x} = 2xy + 1, \qquad \frac{\partial F}{\partial y} = x^2 + 4y. \tag{8-8}$$
From the first of Eqs. (8-8) we conclude that
$$F(x,y) = \int (2xy + 1)\,dx + f(y) = x^2 y + x + f(y). \tag{8-9}$$
To satisfy the second of Eqs. (8-8), we must have
$$\frac{\partial F}{\partial y} = x^2 + f'(y) = x^2 + 4y,$$
so that $f'(y) = 4y$.
The integration yields
$$f(y) = 2y^2,$$
and the substitution in (8-9) gives $F(x,y) = x^2 y + x + 2y^2$. The desired solution,
therefore, is
$$x^2 y + x + 2y^2 = c.$$
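The exactness test and the recovered function $F$ of the example can be checked with finite differences. This is a minimal sketch; the helper names `partial_x` and `partial_y` and the sample points are assumptions made for illustration.

```python
def P(x, y):
    return 2.0 * x * y + 1.0

def Q(x, y):
    return x * x + 4.0 * y

def F(x, y):
    """Potential found in the example: F = x^2 y + x + 2 y^2."""
    return x * x * y + x + 2.0 * y * y

def partial_x(f, x, y, h=1e-4):
    """Central-difference estimate of the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def partial_y(f, x, y, h=1e-4):
    """Central-difference estimate of the partial derivative in y."""
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

points = [(0.5, -1.0), (1.0, 2.0), (-2.0, 0.3)]
# Condition (8-5): dP/dy - dQ/dx should vanish.
exactness = [partial_y(P, x, y) - partial_x(Q, x, y) for x, y in points]
# Relations (8-8): the gradient of F should reproduce (P, Q).
grad_error = [max(abs(partial_x(F, x, y) - P(x, y)),
                  abs(partial_y(F, x, y) - Q(x, y))) for x, y in points]
```

Both lists should contain values at the level of rounding error, since all the functions involved are polynomials of low degree.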
1 For details and general discussion see Chap. 5, Sec. 9.
PROBLEMS
Integrate the following equations if they are exact:
1. («* *f 3) dx + dy «* 0; 2. ( 2x -f - e xl *) dx « \ e xtv dy;
\ V / 2
5. (3x 2 y ~ y z ) (h ** (3y 2 x - x 3 ) dy; 4. x dy + y dx » 0;
V V 1 y
6. -*5 cos - dx * - cos - dy; 6. x dx *f y dy «* 0;
x* x x x
7. (3 x 2 y - y 3 ) dx - (x 3 + 3y*i) dy - 0;
8. (y cos xy + 2x) dx -f x cos xy dy ~ 0;
9. (y 2 + 2xy + 1) dx + (2xy + x 2 ) dy - 0;
10. 3 x 2 y dx + (x 3 -- 3y 2 x 2 ) dy - 0.
9. Integrating Factors. Suppose that
$$M(x,y)\,dx + N(x,y)\,dy = 0 \tag{9-1}$$
has a solution
$$F(x,y) = c, \tag{9-2}$$
where $F(x,y)$ is a differentiable function. On differentiating (9-2) with
respect to $x$, we get
$$\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\,y' = 0, \tag{9-3}$$
and from (9-1) we find
$$M(x,y) + N(x,y)\,y' = 0. \tag{9-4}$$
The elimination of $y'$ from (9-3) and (9-4) gives
$$\frac{\partial F/\partial x}{M(x,y)} = \frac{\partial F/\partial y}{N(x,y)} = \mu(x,y), \tag{9-5}$$
where $\mu(x,y)$ is the value of the common ratio. It follows from (9-5) that
$$\frac{\partial F}{\partial x} = \mu(x,y)M(x,y), \qquad \frac{\partial F}{\partial y} = \mu(x,y)N(x,y)$$
and hence that
$$\mu(x,y)(M\,dx + N\,dy) = 0$$
is an exact equation; namely, it is the equation $dF = 0$.
The function $\mu(x,y)$ is termed an integrating factor. It is clear from the
above discussion that every equation (9-1) has an integrating factor and,
in fact, an unlimited number of them.¹ Nevertheless, it must not be
concluded that an integrating factor can always be found easily. In simpler
cases, however, it can be found by inspection.
¹ Some integrating factors introduce extraneous solutions $y$ which make $\mu(x,y) = 0$ but
do not satisfy (9-1).
Thus, in order to solve
$$x\,dy - y\,dx = 0,$$
which is not exact as it stands, multiply both sides by $1/xy$. Then the
equation becomes
$$\frac{dy}{y} - \frac{dx}{x} = 0,$$
which is exact. Another integrating factor for this same equation is $1/x^2$.
Similarly, multiplication by $1/y^2$ makes the equation exact.
Example: Solve the differential equation
$$(y^2 - x^2)\,dy + 2xy\,dx = 0.$$
This is not an exact equation, but on rearrangement it becomes
$$y^2\,dy + 2xy\,dx - x^2\,dy = 0,$$
which can be made exact with the aid of the integrating factor $1/y^2$. The resulting
equation is
$$dy + \frac{2xy\,dx - x^2\,dy}{y^2} = 0,$$
which integrates to
$$y + \frac{x^2}{y} = c.$$
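The effect of the integrating factor $1/y^2$ can be seen numerically by comparing the quantity $\partial M/\partial y - \partial N/\partial x$ before and after multiplication. The sketch below uses central differences; the name `exactness_defect` and the sample point are hypothetical choices.

```python
def M(x, y):
    return 2.0 * x * y

def N(x, y):
    return y * y - x * x

def mu(x, y):
    """Integrating factor 1/y^2 used in the example."""
    return 1.0 / (y * y)

def exactness_defect(m, n, x, y, h=1e-5):
    """Central-difference value of dM/dy - dN/dx; zero for an exact equation."""
    dm_dy = (m(x, y + h) - m(x, y - h)) / (2.0 * h)
    dn_dx = (n(x + h, y) - n(x - h, y)) / (2.0 * h)
    return dm_dy - dn_dx

def scaled_M(x, y):
    return mu(x, y) * M(x, y)

def scaled_N(x, y):
    return mu(x, y) * N(x, y)

before = exactness_defect(M, N, 1.5, 2.0)
after = exactness_defect(scaled_M, scaled_N, 1.5, 2.0)
```

At the sample point, `before` should be close to $4x = 6$, while `after` should be close to zero.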
PROBLEMS
The following problems give a few of the integrable combinations that commonly oc-
cur in practice. Verify the equations by differentiating:
4 , (. _i A xdy - ydx
1. d (Un J - + yi ,
8 . d (f ) =
\y / y“
6. d(x 2 4- y 2 ) * xdx 4- y dy;
' t . d ( log y)„ x Jv^i d *
\ x/ xy
4 . d (v\ = z*l^s±
\x/ X 2
6. d{xy) * x dy 4- y dx.
x dy — ydx
Solve the following equations by finding a suitable integrating factor:
7. $x\,dy + x^2\,dx = y\,dx$;
8. $(xy^2 + y)\,dx = (x^2 y - x)\,dy$;
9. $x\,dy + 3y\,dx = xy\,dy$;
10. $(x^2 + y^2 + 2x)\,dy = 2y\,dx$;
11. $x\,dy - y\,dx = xy\,dy$;
12. $(x^2 - y^2)\,dy = 2xy\,dx$.
10. The First-order Linear Equation. An equation of the form
$$\frac{dy}{dx} + M(x)\,y = N(x) \tag{10-1}$$
is termed linear for reasons given in Sec. 21.
If we set $y = uv$, where $u$ and $v$ are functions of $x$ to be determined later,
we get on substitution in (10-1)
$$uv' + vu' + Muv = N,$$
or
$$v(u' + Mu) + uv' = N. \tag{10-2}$$
If $u$ is suitably chosen, the parenthesis in (10-2) can be made equal to
zero, thus reducing (10-2) to a simpler form. To this end, set
$$u' + Mu = 0, \tag{10-3}$$
which is a separable equation for $u$. We get
$$\frac{du}{u} + M\,dx = 0,$$
so that
$$\log u + \int M\,dx = c. \tag{10-4}$$
Since any solution of (10-3) reduces (10-2) to the form
$$uv' = N, \tag{10-5}$$
we choose the simplest one, corresponding to $c = 0$. With this choice,
(10-4) yields
$$u = e^{-\int M\,dx}, \tag{10-6}$$
and (10-5) becomes
$$v' = N e^{\int M\,dx}. \tag{10-7}$$
Since the right-hand member in (10-7) depends only on $x$, we get, on
integrating,
$$v = \int N e^{\int M\,dx}\,dx + c.$$
Recalling the assumption that $y = uv$, we get the general solution
$$y = e^{-\int M\,dx}\int N e^{\int M\,dx}\,dx + c\,e^{-\int M\,dx}. \tag{10-8}$$
Example 1. Solve $y' + y\cos x = \sin 2x$. Here $M(x) = \cos x$ and $N(x) = \sin 2x$.
Since $\int M\,dx = \int \cos x\,dx = \sin x$, (10-8) yields
$$y = e^{-\sin x}\int e^{\sin x}\sin 2x\,dx + c\,e^{-\sin x},$$
which is easily evaluated by replacing $\sin 2x$ by $2\sin x\cos x$.
Example 2. Solve $(x + 1)y' + 2y = (x + 1)^4$. Dividing by $x + 1$ shows that this
equation is linear with $M = 2/(x + 1)$ and $N = (x + 1)^3$. Hence
$$e^{\int M\,dx} = e^{\int \frac{2\,dx}{x+1}} = e^{2\log(x+1)} = (x + 1)^2,$$
while
$$e^{-\int M\,dx} = (x + 1)^{-2}.$$
Thus (10-8) yields
$$y = (x + 1)^{-2}\int (x + 1)^5\,dx + c(x + 1)^{-2} = \frac{(x + 1)^4}{6} + c(x + 1)^{-2}.$$
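Both examples can be verified by substituting the solutions back into their equations. For Example 1 the sketch below uses the evaluated form $y = 2(\sin x - 1) + c\,e^{-\sin x}$, which is obtained here by carrying out the suggested integration and is not written out in the text; the function names and sample values are illustrative assumptions.

```python
import math

def y_example1(x, c):
    """Evaluated solution of y' + y cos x = sin 2x (integration done by hand)."""
    return 2.0 * (math.sin(x) - 1.0) + c * math.exp(-math.sin(x))

def y_example2(x, c):
    """Solution of (x + 1) y' + 2 y = (x + 1)^4 from Example 2."""
    return (x + 1.0) ** 4 / 6.0 + c * (x + 1.0) ** -2

def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def residual1(x, c):
    dy = derivative(lambda s: y_example1(s, c), x)
    return dy + y_example1(x, c) * math.cos(x) - math.sin(2.0 * x)

def residual2(x, c):
    dy = derivative(lambda s: y_example2(s, c), x)
    return (x + 1.0) * dy + 2.0 * y_example2(x, c) - (x + 1.0) ** 4

r1 = residual1(0.7, 1.3)
r2 = residual2(0.7, 1.3)
```

Both residuals should vanish up to the finite-difference error for any choice of the constant $c$.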
PROBLEMS
Solve the following equations. In Probs. 3 to 5 find a solution through the point
( 0 ,- 1 ).
1. (1 + x 2 ) dy “ - x l /) dx. 2. (z 2 + 1)2/' + 2xy » x 2 .
3. y' * e'** — 2xy. 4 . y' 4- xy — x *= 0.
6. y r -f y cos x ** cos 8 x. 6, xy' -f V ”* a: 2 sin x.
7. Show, on writing Eq. (10-1) in the form
dy + A fy dx — N dx,
that eJ M dx is an integrating factor of this equation, and thus obtain formula (10-8)*
Solve the following equations, each of which is separable, homogeneous, exact, or
linear. (It is instructive to use several methods when possible.)
8, y' » y 4* cos x — sin x.
dx
10. - - + yx * y
dy
12. y' 4- yx ■» //.
q dy ^ y 2 - x/r 2 - y 2
dx xy
11. x 2 (l + 4 y z ) dx + 3yx 8 dy - 0.
13. — — - dx + (1 - e v ) dy » 0.
V
11. Equations Solvable for y or y'. Certain special types of equations
can be solved by writing $p = dy/dx$ and expressing $p$ as a function of
$x$ and $y$. Another method is to solve for $y$ in terms of $x$ and $p$ and then
differentiate with respect to $x$, using $dy/dx = p$. These procedures change
the given first-order equation into a new one.
Example 1. Solve $2p^2 - (2y^2 + x)p + xy^2 = 0$, where $p = dy/dx$.
Factoring gives $(p - y^2)(2p - x) = 0$ so that, at each $x$, we have either $p = y^2$ or
$p = x/2$. The fact that $y$ is to be differentiable ensures that one or other of these
relations actually holds throughout an interval. Hence, with $p = dy/dx$, they can be
regarded as differential equations and solved in the ordinary way. From $dy/dx = y^2$
there results
$$x + \frac{1}{y} = c_1, \tag{11-1}$$
and from $dy/dx = x/2$ is obtained
$$y = \frac{x^2}{4} + c_2. \tag{11-2}$$
These two sets of curves represent the desired solution. Although there is no advantage
in doing so, one may write (11-1) and (11-2) as a single equation with a single parameter,
$$\left(x + \frac{1}{y} + c\right)\left(y - \frac{x^2}{4} + c\right) = 0.$$
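A direct substitution confirms that each family annihilates the original quadratic in $p$. In the sketch below the helper names and the sample values of $x$, $c_1$, and $c_2$ are arbitrary illustrative choices.

```python
def residual(x, y, p):
    """Left side of 2 p^2 - (2 y^2 + x) p + x y^2 = 0."""
    return 2.0 * p * p - (2.0 * y * y + x) * p + x * y * y

def family_one(x, c1):
    """Curve x + 1/y = c1, with slope p = y^2."""
    y = 1.0 / (c1 - x)
    return y, y * y

def family_two(x, c2):
    """Curve y = x^2/4 + c2, with slope p = x/2."""
    return x * x / 4.0 + c2, x / 2.0

y1, p1 = family_one(0.3, 2.0)
y2, p2 = family_two(0.3, -1.0)
checks = (residual(0.3, y1, p1), residual(0.3, y2, p2))
```

Both entries of `checks` should be zero up to rounding.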
Example 2. Solve p h vy *» 1, where p — dy/dx.
Since it is impractical to solve this equation for p to obtain p ** f{y) (which would
have led to a separable equation), we solve it for y and obtain
V
Differentiating (11-3) with respect to x leads to
which can be written as
After integration we get
dy l dp „dp
dx ^ p* dx ^ dx'
dx =» — - dp — 4 •©* dp.
(11-3)
(11-4)
which, together with (11-3), gives the desired solution in parametric form. There is no
advantage in eliminating the parameter p in Eqs. (11-3) and (11-4), even when it is
possible to do so. Plotting the curves representing the solution as p varies, one obtains
not only the locus (x,y) but also the slope p at each point.
The method used to solve the equation in the preceding example can be
applied to solve the Lagrange equation
$$y = x f(y') + g(y'), \tag{11-5}$$
where $f$ and $g$ are differentiable functions of $y' = p$. On setting $y' = p$
in (11-5) one obtains
$$y = x f(p) + g(p). \tag{11-6}$$
Differentiating with respect to $x$ yields
$$p = x f'(p)\frac{dp}{dx} + f(p) + g'(p)\frac{dp}{dx},$$
which can be written as
$$\frac{dx}{dp} - \frac{f'(p)}{p - f(p)}\,x = \frac{g'(p)}{p - f(p)}. \tag{11-7}$$
This equation is linear in $x$; that is, it is of the form $dx/dp + M(p)x = N(p)$,
and it can be solved by the method of Sec. 10. Its solution for $x$ as a
function of $p$, together with (11-6), yields the solution of the original
equation in parametric form, with $p$ as parameter.
The reader will find it instructive to apply this method to solve
$y = xy' + (y')^2$ and show that $y = cx + c^2$.
PROBLEMS
Problems 1 and 2 are to be solved by the method of Example 1; Probs. 3 and 4 by
that of Example 2.
1. $p^2 - 2yp = 3y^2$.
2. $p^2 + 1 = 2p$.
3. $p^4 = p^2 y + 2$.
4. $p^3 + 2p = e^y$.
5. $p^2 x - py + 1 = 0$.
6. $p^2 + y^2 = 1$.
7. $p^2 + (2x - y)p = 2xy$.
8. $p^2 + (x - e^x)p = xe^x$.
9. x = 2p - p*.
10. Show that Clairaut's equation $y = xp + f(p)$ is a special case of Lagrange's
equation (11-5), and thus obtain the solution.
12. The Method of Substitution. Many first-order equations can be
solved by a suitable change of variable. This has already been demonstrated
in the substitution $y = vx$ for the homogeneous equation (Sec. 7),
in the substitution $y = uv$ of Sec. 10, and in the use of $p = dy/dx$ as
independent variable in Sec. 11. Further examples of the substitution
method are given in this section.
Thus, the Bernoulli equation
$$y' + P(x)\,y = Q(x)\,y^n \tag{12-1}$$
can be reduced to a linear equation by setting $z = y^{1-n}$.
On dividing (12-1) by $y^n$, we get
$$y^{-n}y' + P(x)\,y^{1-n} = Q(x).$$
But since $(y^{1-n})' = (1 - n)y^{-n}y'$, we can write this as
$$\frac{1}{1 - n}(y^{1-n})' + P(x)\,y^{1-n} = Q(x).$$
On making the substitution $z = y^{1-n}$, we get the linear equation
$$z' + (1 - n)P(x)\,z = (1 - n)Q(x), \tag{12-2}$$
which is solvable by the method of Sec. 10.
The equation
$$\frac{dy}{dx} = f\!\left(\frac{a_1 x + b_1 y + c_1}{a_2 x + b_2 y + c_2}\right) \tag{12-3}$$
can be solved by the substitution $x = u - h$, $y = v - k$ if the constants
$h$, $k$ are chosen so as to make the resulting equation homogeneous. This
procedure, which is simply a translation of axes, is illustrated in Example 2.
Because of the habitual use of the notation $dy/dx$, which implies that
$y$ is a dependent variable, one may fail to recognize that an equation is
solvable if the roles of $x$ and $y$ are interchanged. For example, an equation
which is nonlinear in $y$ may become linear if $x$ is regarded as the unknown
and $y$ is regarded as the independent variable. If an equation seems
intractable as it stands, it is often helpful to interchange $x$ and $y$, simplify,
and attempt to solve the new equation. Then interchange $x$ and $y$ in
the solution of this to obtain the solution to the original equation. The
procedure, which is illustrated in Example 3, amounts simply to the change
of variable $x = \bar{y}$, $y = \bar{x}$.
Example 1. Solve the equation $y' + y = xy^3$.
This is a special case of Bernoulli's equation. Set $z = y^{-2}$ to obtain $z' - 2z = -2x$
by direct calculation or by (12-2). The general solution is $z = ce^{2x} + x + \tfrac12$, so that
$y^{-2} = ce^{2x} + x + \tfrac12$ is the solution of the original equation.
Example 2. Solve
$$\frac{dy}{dx} = \frac{x - y - 2}{x + y + 6}$$
by means of the substitution $x = u - h$, $y = v - k$, where $h$, $k$ are suitably chosen
constants.
Substituting gives
$$\frac{dv}{du} = \frac{u - v - (h - k + 2)}{u + v - (h + k - 6)}. \tag{12-4}$$
If $h$ and $k$ are so determined that
$$h - k + 2 = 0, \qquad h + k - 6 = 0, \tag{12-5}$$
then (12-4) becomes the homogeneous equation
$$\frac{dv}{du} = \frac{u - v}{u + v},$$
whose solution is
$$u^2 - 2uv - v^2 = c_1 \tag{12-6}$$
by Sec. 7. Equations (12-5) give $h = 2$, $k = 4$, so that $u = x + h = x + 2$,
$v = y + k = y + 4$. Substitution in (12-6) leads to the final answer
$$x^2 - 2xy - y^2 - 4x - 12y = c$$
after simplification.
Example 3. Solve $(x - y^3)\,dy = y\,dx$.
Interchanging $x$ and $y$ gives $(y - x^3)\,dx = x\,dy$ or $y' - y/x = -x^2$. This equation
is linear in $y$ and gives $2y = cx - x^3$ by the method of Sec. 10. Hence the solution of
the original equation is $2x = cy - y^3$.
Example 4. Show how to solve the equation $y' = F(ax + by + c)$, where $a$, $b$, $c$
are constant.
Let $z = ax + by + c$, so that $z' = a + by'$. Combining this with the original equation
gives $z' - a = by' = bF(ax + by + c) = bF(z)$, or $z' = a + bF(z)$. This equation
is separable. The procedure fails if $b = 0$, but then the original equation is separable.
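The Bernoulli example lends itself to a numerical check: integrate $y' = xy^3 - y$ directly and compare with $y = (ce^{2x} + x + \tfrac12)^{-1/2}$. The sketch below assumes the initial condition $y(0) = 1$, which corresponds to $c = \tfrac12$, and a classical Runge-Kutta integrator; both are illustrative choices, as are the function names.

```python
import math

def bernoulli_rhs(x, y):
    """y' = x y^3 - y, i.e. the equation y' + y = x y^3 solved for y'."""
    return x * y**3 - y

def rk4(x, y, x_end, steps=2000):
    """Advance y from x to x_end with classical Runge-Kutta steps."""
    h = (x_end - x) / steps
    for _ in range(steps):
        k1 = bernoulli_rhs(x, y)
        k2 = bernoulli_rhs(x + 0.5 * h, y + 0.5 * h * k1)
        k3 = bernoulli_rhs(x + 0.5 * h, y + 0.5 * h * k2)
        k4 = bernoulli_rhs(x + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return y

def y_closed_form(x, c=0.5):
    """Positive branch of y^(-2) = c e^(2x) + x + 1/2."""
    return (c * math.exp(2.0 * x) + x + 0.5) ** -0.5

numeric = rk4(0.0, 1.0, 1.0)
exact = y_closed_form(1.0)
```

The values `numeric` and `exact` should agree to within the integration error.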
PROBLEMS
Solve the following special cases of Bernoulli’s equation:
1, f-j~~ 1 — « sm x;
ax %
a- »' + v -
sec. 13]
3
8* xy f 4 V
THE SOLUTION OP FIRST-ORDER EQUATIONS
29
i?l + ± mg *.
%/dx xy® '
V 2 log x;
4. y' - x~*y 4- x~V - 0;
6. 1/' 4 xy « x 8 y* .
Reduce the following equations to a form which is homogeneous or has separable
variables, but do not solve;
7. y' -
9. y' -
x 4 y - l m
2x 4 y 4" 2 f
. * — y 4* 2
8. y'-
10. y'
3 x 4 y 4 6
3x4l/47 ;
cos (x 4 i/).
x 4 y 4 3
Solve by interchanging x and y and using an appropriate method on the result:
dx
11 . 08
dy
«Vi
12. y dx - (x 4 y 3 ) dy;
13, eV; 14. 1 4 xy' tan y * y'.
Solve the following review problems by any method:
16. ?/(l 4 x 2 ) 1 dx 4 tan 1 x dy « 0;
_ dx x . .
17. 4 - 4 y l
dy v
0 ;
19. e x y' » e* 4 e v ;
21. dx 4 2x dy « y dy ;
23. dy - (2y 4 c 3 ') dx;
25. $(x - y + 1)\,dx + (x + y - 1)\,dy = 0$.
16. $(1 + x^2)\,dy = (1 + y^2)\,dx$;
18. $\sin 2y\,dx + 2x\cos 2y\,dy = 0$;
20. $dx = (yx^3 - x)\,dy$;
22. $(x^2 + y^2)\,dx = xy\,dy$;
24. y 2 = (xy - xV)y';
13. Reduction of Order. With $y' = p$, the transformations
$$y'' = \frac{dp}{dx}, \tag{13-1}$$
$$y'' = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy} \tag{13-2}$$
often enable us to reduce an equation of second order in $y$ to one of first
order in $p$. For example, the equations
$$F(x, y', y'') = 0, \tag{13-3}$$
$$F(y, y', y'') = 0 \tag{13-4}$$
become by (13-1) and (13-2), respectively,
$$F\!\left(x, p, \frac{dp}{dx}\right) = 0, \tag{13-5}$$
$$F\!\left(y, p, p\frac{dp}{dy}\right) = 0. \tag{13-6}$$
These are first-order equations in $p$, and when $p$ has been found, the
substitution $p = y'$ yields a first-order equation for $y$.
Example 1. Find the solution of $y''\sin y' = \sin x$ which satisfies the conditions
$y(1) = 2$ and $y'(1) = 1$.
Being free of $y$ the equation has the form (13-3), and it becomes
$$\sin p\,\frac{dp}{dx} = \sin x$$
by (13-1). Solving this separable equation yields
$$-\cos p = -\cos x + c,$$
which is reduced to $p = x$ by the condition $p = 1$ at $x = 1$. Writing $dy/dx$ for $p$ in the
equation $p = x$ gives on integration the final answer
$$y = \tfrac12(x^2 - 1) + 2$$
in view of the condition $y(1) = 2$.
Example 2. Solve $yy'' - 2(y')^2 + y^2 = 0$.
This equation has the form (13-4), since it does not contain $x$. The transformation
(13-2) gives
$$yp\frac{dp}{dy} - 2p^2 + y^2 = 0,$$
which is a homogeneous equation with $y$ as independent variable. Setting $p = vy$ and
proceeding as in Sec. 7 give, after calculation,
$$p = \pm y\sqrt{1 + c^2 y^2}. \tag{13-7}$$
With $p = dy/dx$ in (13-7) we separate variables to obtain the final answers,
$$x + c_1 = \mp\sinh^{-1}\frac{1}{cy}.$$
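Relation (13-7) can be checked against the original second-order equation by using $y'' = p\,dp/dy$ from (13-2). In the sketch below $dp/dy$ is estimated by a central difference; the function names and the sample values of $y$ and $c$ are arbitrary illustrative choices.

```python
import math

def p_of_y(y, c):
    """First integral (13-7), positive sign: p = y sqrt(1 + c^2 y^2)."""
    return y * math.sqrt(1.0 + c * c * y * y)

def residual(y, c, h=1e-5):
    """Value of y y'' - 2 (y')^2 + y^2 with y' = p and y'' = p dp/dy."""
    p = p_of_y(y, c)
    dp_dy = (p_of_y(y + h, c) - p_of_y(y - h, c)) / (2.0 * h)
    y_second = p * dp_dy
    return y * y_second - 2.0 * p * p + y * y

samples = [residual(1.2, 0.7), residual(0.4, 2.0), residual(3.0, 0.1)]
```

Each entry of `samples` should vanish up to the finite-difference error. The negative sign in (13-7) gives the same residual, because $p$ enters only through $p^2$ and $p\,dp/dy$.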
PROBLEMS
Problems 1 and 2 are to be solved by the method of Example 1, Probs. 3 and 4 by that
of Example 2, and Probs. 5 to 7 by whichever method is more suitable.
1. (1 - x 2 )y" ~ xy f . 2. x(y" + y *) - y'.
8. y" 4- * 0. 4. y" « yy'.
8. x 2 y" - 1 - X. 6. yy" - y' 2 .
7. xy" •* 4x - 2y'.
8 . Solve y" ® 1 4- j; 7 * by both methods of this section, and verify the agreement of
the results.
GEOMETRY AND THE FIRST-ORDER EQUATION
14. Orthogonal Trajectories. In a variety of practical investigations,
it is desirable to determine the equation of a family of curves that intersect
the curves of a given family at right angles. For example, it is known that
the lines of equal potential, due to a distribution of steady current flowing
in a homogeneous conducting medium, intersect the lines of current flow
at right angles. Again, the streamlines of a steady flow of liquid intersect
the lines of equal velocity potential (see Chap. 7, Sec. 19) at right angles.
Let the equation of the given family of curves be
$$f(x,y,c) = 0, \tag{14-1}$$
where $c$ is an arbitrary parameter. By specifying the values of the
parameter $c$, one obtains a family of curves (see solid curves in Fig. 6). Let
it be required to determine the equation of a family of curves orthogonal
to the family defined by (14-1).
[Fig. 6]
The differential equation of the family of curves (14-1) can be obtained
by eliminating the parameter $c$ from (14-1) and its derivative,
$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0. \tag{14-2}$$
Let the resulting differential equation be
$$F\!\left(x, y, \frac{dy}{dx}\right) = 0.$$
Now, by definition, the orthogonal family of curves cuts the curves of
the given family (14-1) at right angles. Hence, the slope at any point
of a curve of the orthogonal family is the negative reciprocal of the slope
of the curves of the given family. Thus, the differential equation of the
desired family of curves is
$$F\!\left(x, y, -\frac{dx}{dy}\right) = 0.$$
This is a differential equation of the first order, and its general solution
has the form
$$\phi(x,y,c) = 0. \tag{14-3}$$
The family of curves defined by (14-3) is the desired family of curves
orthogonal to the curves of the given family (14-1). It is called the family
of orthogonal trajectories.
If the equation of a family of curves is given in polar coordinates as
$f(r,\theta,c) = 0$, the tangent of the angle $\alpha$ made by the radius vector and the
tangent line at any point $(r,\theta)$ of a curve of the family is equal to $r\,d\theta/dr$
(Fig. 7). Hence, by the preceding discussion, the differential equation
of the orthogonal trajectories of the given family of curves is obtained by
replacing $r\,d\theta/dr$ by $-dr/(r\,d\theta)$ in the differential equation of the given
family of curves.
[Fig. 7]
Example. Let it be required to find the family of curves orthogonal to the family of
circles (Fig. 8)
$$x^2 + y^2 - cx = 0. \tag{14-4}$$
The differential equation of the family (14-4) can be obtained by differentiating (14-4)
with respect to $x$ and eliminating the parameter $c$ between (14-4) and the equation that
results from the differentiation.
The reader will check that the differential equation of the family (14-4) is
$$2xy\frac{dy}{dx} + x^2 - y^2 = 0.$$
Hence, the differential equation of the family of curves orthogonal to (14-4) is
$$2xy\frac{dx}{dy} - x^2 + y^2 = 0.$$
This is a homogeneous differential equation whose solution is found to be
$$x^2 + y^2 - cy = 0.$$
Thus, the desired family of curves is a family of circles with centers on the y axis (see
Fig. 8).
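The orthogonality of the two families of circles can be confirmed at any intersection point by checking that the gradients of $x^2 + y^2 - c_1 x$ and $x^2 + y^2 - c_2 y$ are perpendicular there. The sketch below computes the intersection point away from the origin in closed form; the function names and the sample parameters are hypothetical.

```python
def intersection(c1, c2):
    """Point other than the origin common to x^2+y^2=c1*x and x^2+y^2=c2*y."""
    d = c1 * c1 + c2 * c2
    return c1 * c2 * c2 / d, c1 * c1 * c2 / d

def gradient_dot(c1, c2):
    """Dot product of the gradients of the two circle equations at that point."""
    x, y = intersection(c1, c2)
    g1 = (2.0 * x - c1, 2.0 * y)      # gradient of x^2 + y^2 - c1*x
    g2 = (2.0 * x, 2.0 * y - c2)      # gradient of x^2 + y^2 - c2*y
    return g1[0] * g2[0] + g1[1] * g2[1]

dots = [gradient_dot(1.0, 2.0), gradient_dot(-3.0, 0.5), gradient_dot(2.5, -4.0)]
```

A vanishing dot product means the normals, and hence the tangents, of the two circles are perpendicular at the point of intersection.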
PROBLEMS
Sketch the following families of curves, find the orthogonal trajectories, and add them
to your sketch:
1. X* + y 2 - o J ;
2. xy
3. y « cx n ;
. **
5 . r ** c;
6. r »
7. r m c( 1 — cos#);
8. r *
-cfl.
V
I — e cos 9
9. If a and b are constant and X a parameter, show that the family of curves
a* + X + 6 J + X
satisfies an equation, free of X, which is unaltered when y f is replaced by — 1 /y'.* What
does this indicate concerning the orthogonal trajectories?
10. Find the algebraic equation, the differential equation, and the orthogonal
trajectories for the family of circles tangent to the y axis at the origin. Verify your result by
plane geometry. (The configuration is a special case of so-called bipolar coordinates.)
15. Parabolic Mirror. Pursuit Curves. Besides the problem of finding
orthogonal trajectories, many other questions in geometry lead to first-order
differential equations. The following examples show how geometrical
conditions of this sort stem from physical conditions. The first
is taken from optics, the second from the theory of pursuit.
Example 1. Find a mirror such that light from a point source at the origin $O$ is
reflected in a beam parallel to the x axis.
Let the ray of light $OP$ strike the mirror at $P$ and be reflected along $PR$ (Fig. 9).
If $PQ$ is the tangent at $P$ and $\alpha$, $\beta$, $\phi$, and $\theta$ are the angles indicated, we have
$\alpha = \beta$ by the optical law of reflection and $\alpha = \phi$ by geometry. Hence $\beta = \phi$.
The equation
$$\tan\theta = \tan(\beta + \phi) = \tan 2\phi = \frac{2\tan\phi}{1 - \tan^2\phi}$$
gives
$$\frac{y}{x} = \frac{2y'}{1 - (y')^2},$$
since $y' = \tan\phi$. Solution of this quadratic equation for $y'$ gives
$$y' = \frac{-x \pm \sqrt{x^2 + y^2}}{y},$$
whence
$$\frac{x\,dx + y\,dy}{\pm\sqrt{x^2 + y^2}} = dx.$$
[Fig. 9]
The left-hand member of this is an exact differential, and we get, on integrating,
$$\pm\sqrt{x^2 + y^2} = x + c,$$
which, on squaring, yields $y^2 = 2cx + c^2$. The curves form a family of parabolas with
focus at the origin.
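The reflecting property of the curves $y^2 = 2cx + c^2$ can be tested directly: a ray from the origin to a point $P$ of the curve, reflected in the tangent at $P$, should leave parallel to the x axis. The sketch below uses the vector law of reflection $\mathbf{r} = \mathbf{d} - 2(\mathbf{d}\cdot\mathbf{n})\mathbf{n}$; the function name and the sample points are illustrative assumptions.

```python
import math

def reflected_direction(y, c):
    """Reflect the ray from the origin at the point of y^2 = 2 c x + c^2 with ordinate y."""
    x = (y * y - c * c) / (2.0 * c)       # abscissa of the point P on the parabola
    slope = c / y                         # from 2 y y' = 2 c
    norm = math.hypot(slope, 1.0)
    n = (-slope / norm, 1.0 / norm)       # unit normal to the mirror at P
    length = math.hypot(x, y)
    d = (x / length, y / length)          # unit direction of the incoming ray OP
    dn = d[0] * n[0] + d[1] * n[1]
    return d[0] - 2.0 * dn * n[0], d[1] - 2.0 * dn * n[1]

rays = [reflected_direction(2.0, 1.0), reflected_direction(-3.0, 2.0)]
```

For each sample the vertical component of the reflected direction should vanish, so the outgoing ray is parallel to the x axis.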
Example 2. A boat $A$ moves along the $y$ axis with constant speed $a$. Find the path of
a second boat $B$ which moves in the left-hand half of the $xy$ plane with constant speed
$b$ and always points directly at $A$.
At a time $t$ min after $A$ is at $(0,0)$, we shall have $A$ at $(0,at)$ and $B$ at $(x,y)$, say. Since
the line $AB$ is tangent to the path of $B$, the slope of this line equals the slope of the path,
so that
$$\frac{y - at}{x - 0} = \frac{dy}{dx} \qquad\text{or}\qquad xy' - y = -at. \tag{15-1}$$
To eliminate $t$, we first differentiate (15-1) and obtain
$$(xy' - y)' = xy'' = -a\,\frac{dt}{dx}. \tag{15-2}$$
Since $ds/dt = b$, where $s$ is arc on the trajectory, we have
$$\frac{dt}{dx} = \frac{dt}{ds}\,\frac{ds}{dx} = \frac{1}{b}\sqrt{1 + y'^2}. \tag{15-3}$$
With $r$ defined as $a/b$, substituting (15-3) in (15-2) yields
$$xy'' = -r\sqrt{1 + y'^2}, \qquad r = \frac{a}{b}, \tag{15-4}$$
which is reduced to a separable equation of first order by letting $p = y'$
as in Sec. 13. The solution is
$$y' = p = \sinh\left(r\log\frac{c}{x}\right) = \frac{1}{2}\left[\left(\frac{c}{x}\right)^{r} - \left(\frac{x}{c}\right)^{r}\right], \tag{15-5}$$
and from this, $y$ is found by integration.
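As a plausibility check on (15-5), the pursuit can be simulated directly in time. The sketch below is an illustration only: the speeds, the starting point $(x_0, 0)$ of $B$, and the Runge-Kutta step count are assumed values. With $B$ starting at $(x_0,0)$ while $A$ is at the origin, the initial slope is zero, so the constant in (15-5) is $c = x_0$.

```python
import math

def pursue(x0, a, b, t_end, steps):
    """RK4 integration of boat B heading at A = (0, a*t) with speed b."""
    def rhs(t, x, y):
        dx, dy = 0.0 - x, a * t - y          # vector from B to A
        d = math.hypot(dx, dy)
        return b * dx / d, b * dy / d        # velocity of B

    h = t_end / steps
    t, x, y = 0.0, x0, 0.0
    for _ in range(steps):
        k1 = rhs(t, x, y)
        k2 = rhs(t + h / 2, x + h / 2 * k1[0], y + h / 2 * k1[1])
        k3 = rhs(t + h / 2, x + h / 2 * k2[0], y + h / 2 * k2[1])
        k4 = rhs(t + h, x + h * k3[0], y + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return t, x, y

a, b, x0 = 1.0, 2.0, -1.0                    # assumed values, r = a/b = 0.5
r = a / b
t, x, y = pursue(x0, a, b, t_end=0.3, steps=3000)
slope_simulated = (y - a * t) / x            # slope of line AB, as in (15-1)
slope_formula = math.sinh(r * math.log(x0 / x))  # (15-5) with c = x0
print(slope_simulated, slope_formula)
```

The two slopes should agree to within the integration error; the end time is kept well before the moment of capture so that the distance between the boats stays away from zero.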
PROBLEMS
Find the curves in the xy plane which satisfy the following conditions:
1. (a) The tangents pass through the origin; (b) the normals pass through the origin.
2. (a) The segment of tangent between a point on the curve and the x axis has unit
length; (b) the projection on the x axis of this segment has unit length.
3. (a) The area bounded by the curve, the x axis, and the ordinate equals the ordinate;
(b) the area equals the length of the curve from (0,1) to (x,y).
4. Find the path of a small boat in a wide river with uniform current if the boat has
constant speed relative to the water and always heads toward a fixed point on the bank.
5. Solve Example 2 completely under the assumption that A is at (0,0) and B is at
$(x_0,0)$ at time $t = 0$. Distinguish the cases $r = 1$ and $r \ne 1$. If $r < 1$, at what point
and when does B overtake A? If $r = 1$, how close can B get to A?
16. Singular Solutions. It was remarked in Sec. 1 that a differential
equation may possess singular solutions, that is, solutions which cannot
be obtained from the general solution by specifying the arbitrary constants.
For investigation of this phenomenon let the family of integral curves
defined by
$$\phi(x,y,c) = 0 \tag{16-1}$$
be the general solution of the first-order equation
$$F(x,y,y') = 0. \tag{16-2}$$
Assume that the family of curves (16-1) possesses an envelope, that is, a
fixed curve C such that every member of the family is tangent to C and
such that C is tangent, at each of its points, to some member of the family.
At a point (x,y) on the envelope, the values $x$, $y$, $y'$ for the envelope are
the same as for the integral curve, and hence these values $x$, $y$, $y'$ satisfy
(16-2). Thus an envelope of a family of solutions is again a solution.
In general, the envelope is not a curve belonging to the family of curves
defined by (16-1), and hence its equation cannot be obtained from (16-1)
by specifying the value of the arbitrary constant c. It is known from
calculus that the equation of the envelope is obtained by eliminating the
parameter c between the equations
$$\phi(x,y,c) = 0 \qquad\text{and}\qquad \phi_c(x,y,c) = 0,$$
where $\phi_c \equiv \partial\phi/\partial c$.
Example: The family of integral curves associated with the equation
$$y^2 (y')^2 + y^2 = a^2 \tag{16-3}$$
is the family of circles
$$(x - c)^2 + y^2 = a^2. \tag{16-4}$$
The equation of the envelope of the family (16-4) is obtained by eliminating c between
(16-4) and $\phi_c = -2(x - c) = 0$. There results
$$y = \pm a, \tag{16-5}$$
which represents the equation of a pair of lines tangent to the family of circles (16-4)
(Fig. 10). Obviously, (16-5) is a singular solution of (16-3), for it is a solution, and
it cannot be obtained from (16-4) by any choice of the constant c. On referring to
Sec. 1, it is easy to check that the conditions of the theorem ensuring uniqueness
of the solution are violated in this example.
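The statement that both the circles and the lines $y = \pm a$ satisfy (16-3) can be confirmed by direct substitution. The following sketch, with an assumed radius and a few assumed sample points, evaluates the residual of (16-3) on the upper half of several circles of the family and on the envelope.

```python
import math

def residual(y, dy, a):
    """Left side minus right side of (16-3): y^2 (y')^2 + y^2 - a^2."""
    return y * y * dy * dy + y * y - a * a

a = 2.0  # assumed radius, for illustration only

# Upper half of the circle (x - c)^2 + y^2 = a^2, where y' = -(x - c)/y.
circle_res = []
for c in [-1.0, 0.0, 3.0]:
    for dx in [-1.5, -0.3, 0.0, 0.9]:        # dx = x - c, with |dx| < a
        y = math.sqrt(a * a - dx * dx)
        circle_res.append(residual(y, -dx / y, a))

# Envelope y = +a and y = -a: the slope is zero everywhere.
envelope_res = [residual(+a, 0.0, a), residual(-a, 0.0, a)]

print(max(abs(v) for v in circle_res), envelope_res)
```

Both sets of residuals vanish up to rounding, although no choice of $c$ in (16-4) produces the lines themselves.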
PROBLEMS
1. (a) Show that $y - c = (x - c)^2$ represents a family of congruent parabolas with
vertex on the line $y = x$, and sketch. (b) By differentiating with respect to c obtain
the envelope $y = x - \tfrac14$. (c) By direct computation, verify that the parabolas and the
envelope have the same slope at corresponding points. (d) Obtain a first-order
differential equation for the family, and (e) verify that $y = x - \tfrac14$ is a singular solution
of this equation.
2. A particle on the x axis has velocity $v = \sqrt{s}$, where $s$ is the distance to the origin.
Show that the motion is uniquely determined if the particle is at any point other than
the origin but that infinitely many different behaviors can occur if the particle ever
reaches the origin.
3. (a) Obtain the equation $yy' + y'^2 + x = 0$ for the orthogonal trajectories of the
family $y = cx + 1/c$. (b) Show that $y = 2\sqrt{x}$ is the envelope of the family. (c) At
points of the curve $y = 2\sqrt{x}$ find the slope of the solutions of $yy' + (y')^2 + x = 0$ in
terms of x. Then find the slope of the curve $y = 2\sqrt{x}$ in terms of x. How are these
two slopes related? Why? (d) Sketch the family, the envelope, and the orthogonal
trajectories in a single diagram.
17. The General Behavior of Solutions. The foregoing paragraphs
indicate that from suitable geometric conditions on a curve, one can obtain
a differential equation for the curve. Now, in this section the point of
view is to be reversed. Starting from the differential equation we obtain
certain geometric conditions, which enable us to describe the solution
qualitatively even when the equation itself cannot be solved.
The function $f(x,y)$ in the general first-order equation
$$\frac{dy}{dx} = f(x,y) \tag{17-1}$$
gives the slope of the solution curve at each point (x,y). Hence the solution
curves are increasing functions of x in regions of the xy plane in which
$f(x,y)$ is positive and decreasing in regions where $f(x,y)$ is negative. For
continuous $f(x,y)$ the boundary between these regions is part or all of the
curve
$$f(x,y) = 0. \tag{17-2}$$
Equation (17-2) gives the locus of the critical points, and their character
(maximum, minimum, neither) is shown by the sign of $f(x,y)$ at neighboring
points. The inflection points and sense of concavity are similarly
found from
$$y'' = f_x + f_y y' = f_x + f_y f, \tag{17-3}$$
where $f_x \equiv \partial f/\partial x$ and $f_y \equiv \partial f/\partial y$.
For more detailed information one can plot the curves
$$f(x,y) = c, \tag{17-4}$$
called isoclines. At any point (x,y) where (17-4) holds, the solution curve
approximates a straight-line segment of slope c, a fact which can be used
as a check on the qualitative information obtained from (17-2) and (17-3).
From this viewpoint (17-1) is equivalent to a direction field in the xy
plane as discussed in Sec. 1. Any curve whose tangent at each point has
the direction of the field is a solution, and conversely. The isoclines
(17-4) and the direction field discussed in Sec. 1 lie at the basis of some
methods for numerical solution of differential equations.
A technique of obtaining approximate solutions, based on a comparison
idea, was developed by S. A. Chaplygin for equations of first and higher
orders (see Fig. 11). Let $y_1(x)$, $y(x)$, and $y_2(x)$ be solutions of
$$\frac{dy_1}{dx} = f_1(x,y_1), \qquad \frac{dy}{dx} = f(x,y), \qquad \frac{dy_2}{dx} = f_2(x,y_2). \tag{17-5}$$
By subtraction, the difference $y - y_1$ satisfies
$$\frac{d}{dx}(y - y_1) = f(x,y) - f_1(x,y_1), \qquad y = y(x),\ y_1 = y_1(x). \tag{17-6}$$
Now, if $f(x,y) > f_1(x,y_1)$ in a range of x, then $y - y_1$ is an increasing
function of x in that range. In this case the condition $y - y_1 = 0$ at some
point $x_0$ ensures that $y - y_1 > 0$ for $x > x_0$ and $y - y_1 < 0$ for $x < x_0$.
Similar remarks apply to $y - y_2$. Hence the conditions
$$f_1(x,y) > f(x,y) > f_2(x,y), \qquad y_1(x_0) = y(x_0) = y_2(x_0), \tag{17-7}$$
in (17-5) enable us to conclude that
$$y_1(x) > y(x) > y_2(x), \quad x > x_0; \qquad y_1(x) < y(x) < y_2(x), \quad x < x_0. \tag{17-8}$$
One chooses $f_1(x,y)$ and $f_2(x,y)$ in such a way that the solutions $y_1$, $y_2$ are
obtainable by elementary methods. Equation (17-8) then gives an explicit
estimate for $y(x)$.
A refinement of these ideas leads to an explicit and important inequality for
estimating the error in certain approximations. Let $y(x)$ be an exact solution of $y' = f(x,y)$
through the point $(x_0,y_0)$, and let $y_1(x)$ be an approximate solution through this point.
Substituting $y_1(x)$ into the equation gives
$$\frac{dy_1}{dx} = f(x,y_1) + e(x), \tag{17-9}$$
where the error term $e(x)$ arises because $y_1$ is not an exact solution. Now, what can be
said about the solution error $|y_1 - y|$ in terms of the substitution error $e(x)$?
To answer this question we suppose that $f(x,y)$ is continuous in a region containing
$(x_0,y_0)$ and satisfies a so-called Lipschitz condition there; that is,
$$|f(x,y_1) - f(x,y)| \le k\,|y_1 - y|, \qquad k \text{ const}, \tag{17-10}$$
for some $k$ and all $x$, $y$, and $y_1$ in the region. The condition (17-10) stipulates that
$f(x,y)$ shall not change too rapidly when $y$ changes. In case $f_y$ exists, the mean-value
theorem gives
$$f(x,y) - f(x,y_1) = f_y(x,\xi)(y - y_1), \qquad y < \xi < y_1, \tag{17-11}$$
and hence (17-10) holds in any region throughout which
$$|f_y(x,y)| \le k. \tag{17-12}$$
Returning to the original question, in (17-9) let $E(x)$ be the error in $y_1$,
$$E(x) = y_1(x) - y(x). \tag{17-13}$$
Since $y(x)$ is an exact solution, we have $y' = f(x,y)$, and hence, subtracting from (17-9),
$$\frac{dy_1}{dx} - \frac{dy}{dx} = \frac{dE}{dx} = f(x,y_1) - f(x,y) + e(x). \tag{17-14}$$
If (17-10) holds, and if $|e(x)| \le m$, then (17-14) leads to
$$\left|\frac{dE}{dx}\right| \le |f(x,y_1) - f(x,y)| + |e(x)| \le k\,|y_1 - y| + m = k\,|E(x)| + m. \tag{17-15}$$
If we could drop the absolute values in (17-15) and replace the $\le$ by $=$, we should
obtain the linear equation
$$\frac{dE}{dx} = kE(x) + m. \tag{17-16}$$
The solution with $E(x_0) = 0$ is
$$E(x) = \frac{m}{k}\left(e^{k(x - x_0)} - 1\right). \tag{17-17}$$
Now, it is plausible and can be proved rigorously that $E(x)$ in (17-17) is the maximum
possible $E(x)$ subject to (17-15), with $x > x_0$. Hence the solution error $E = y_1 - y$
satisfies
$$|y_1(x) - y(x)| \le \frac{m}{k}\left(e^{k|x - x_0|} - 1\right), \qquad m = \max |e(x)|, \quad k = \text{Lipschitz constant}, \tag{17-18}$$
where $|x - x_0|$ is used rather than $(x - x_0)$ to account for the case $x < x_0$.
Equation (17-18) leads at once to a uniqueness theorem, for if $y_1(x)$ is an exact
solution, then $e(x) = 0$ in (17-9), hence $m = 0$ in (17-18), and therefore $y_1(x) = y(x)$.
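The bound (17-18) is easy to try on a case where the exact solution is known. The sketch below is an illustration under assumed choices, none of which come from the text: the equation $y' = y$ with $y(0) = 1$ (so $k = 1$ serves as Lipschitz constant), the quadratic Taylor polynomial as approximate solution $y_1$, and the interval $0 \le x \le 0.5$.

```python
import math

def f(x, y):
    """Right side of the test equation y' = f(x, y); here y' = y."""
    return y

def y_exact(x):
    return math.exp(x)

def y_approx(x):
    """Assumed approximate solution: quadratic Taylor polynomial."""
    return 1.0 + x + 0.5 * x * x

def dy_approx(x):
    return 1.0 + x

x0, x_max, k = 0.0, 0.5, 1.0      # |f_y| = 1, so k = 1 works in (17-12)
grid = [x0 + i * (x_max - x0) / 50 for i in range(51)]

# Substitution error e(x) = y1' - f(x, y1), as in (17-9); m bounds |e|.
m = max(abs(dy_approx(x) - f(x, y_approx(x))) for x in grid)

rows = []
for x in grid:
    actual = abs(y_approx(x) - y_exact(x))
    bound = m / k * (math.exp(k * abs(x - x0)) - 1.0)   # (17-18)
    rows.append((x, actual, bound))

print("m =", m, " last row:", rows[-1])
```

In this example the actual error stays well inside the bound, which is typical: (17-18) is a worst-case estimate and is usually not sharp.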
Example 1. Discuss the integral curves for the equation $y' = xy - 1$ without solving
the equation.
The hyperbola $xy = 1$ is the locus where $y' = 0$. If $xy > 1$, then $y' > 0$ and the
solution curves are increasing, but if $xy < 1$, they are decreasing. Hence, $xy = 1$ gives
a locus of minima in the first quadrant, maxima in the third quadrant. Since $y' = -1$
when $x = 0$ or $y = 0$, all integral curves intersect the axes at an angle of 135°. From
$$y'' = xy' + y = x(xy - 1) + y = y(x^2 + 1) - x,$$
the curve is concave up if $y > x/(x^2 + 1)$ and concave down when this inequality is
reversed. The curves have the appearance shown in Fig. 12.
Fig. 12
Example 2. If $y' = \sin xy$, $y(0) = 1$, show that
$$e^{x^2/\pi} < y < e^{x^2/2},$$
at least for $0 < x < 0.8$.
A glance at the graph of $\sin u$ shows that
$$\frac{2}{\pi}\,u < \sin u < u$$
for $0 < u < \pi/2$, and hence
$$\frac{2}{\pi}\,xy < \sin xy < xy.$$
The solution of the given equation therefore lies between the solutions to
$$y' = \frac{2}{\pi}\,xy, \qquad y' = xy.$$
This gives the desired inequality for the range $0 < xy < \pi/2$. Since $y < e^{x^2/2}$, it suffices
to have
$$0 < x e^{x^2/2} < \pi/2,$$
and this is true for $0 < x < 0.8$.
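The comparison estimate of Example 2 can be tested against a numerical solution. The sketch below assumes a classical fourth-order Runge-Kutta scheme with a fixed step (neither is specified in the text) and samples the computed solution at $x = 0.1, 0.2, \ldots, 0.8$.

```python
import math

def solve(f, x_end, steps, every):
    """RK4 for y' = f(x, y), y(0) = 1; returns samples (x, y)."""
    h = x_end / steps
    x, y = 0.0, 1.0
    out = []
    for i in range(1, steps + 1):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h / 2 * k1)
        k3 = f(x + h / 2, y + h / 2 * k2)
        k4 = f(x + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x = i * h
        if i % every == 0:
            out.append((x, y))
    return out

samples = solve(lambda x, y: math.sin(x * y), x_end=0.8, steps=800, every=100)

for x, y in samples:
    lower = math.exp(x * x / math.pi)   # solution of y' = (2/pi) x y
    upper = math.exp(x * x / 2.0)       # solution of y' = x y
    print(f"x={x:.1f}  {lower:.6f} < {y:.6f} < {upper:.6f}")
```

The computed values lie much closer to the upper curve than to the lower one, because $\sin u \approx u$ for the small values of $u = xy$ that occur near $x = 0$.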
PROBLEMS
1. For the equation $y' = y/x - 1$, (a) sketch the locus $y' = 0$ in the xy plane. (b)
Indicate the regions in which y is increasing; decreasing. (c) When is y concave up?
Down? (d) At what slope do the solutions cross the axes? (e) Sketch the locus where
the solutions have slope 1, −1, 2, −2, 5, −5. (f) Sketch the solutions as well as you
can. (g) Verify your work by solving the equation.
2. In what regions of the xy plane are the solutions of $y' = \sin(x^2 + y^2)$ increasing?
Decreasing? Sketch.
3. Discuss the equation $y' = \sin(x^2 + y)$, $y(0) = 2$, by comparing with suitably chosen
simpler equations.
APPLICATIONS OF FIRST-ORDER EQUATIONS
18. The Hanging Chain. Let it be required to find the curve assumed
by a flexible chain in equilibrium under gravity (Fig. 13). With $s$ as arc
from the point $x = 0$, let the weight density of the chain be $w(s)$ lb per ft
and let the loading function be $f(x)$ lb per ft. The equation of the curve
$y = y(x)$ will be obtained from the fact that the portion of chain between
0 and x is in static equilibrium.
Fig. 13
Equating horizontal forces gives
$$T_0 = T\cos\theta, \tag{18-1}$$
where $T_0$ is the tension at the lowest point and $\theta$ the angle which the tangent
to the curve makes with the horizontal. Similarly, equating vertical
forces gives
$$\int_0^s w(s)\,ds + \int_0^x f(x)\,dx = T\sin\theta \tag{18-2}$$
since the weight of the chain-plus-load must be balanced by the vertical
component of T. Both (18-1) and (18-2) require that the function $y(x)$
be differentiable (so that $\theta$ is well defined) and they use the fact that the
tension is tangential, for a flexible chain.
From (18-1) we have $T = T_0/\cos\theta$, so that
$$T\sin\theta = T_0\tan\theta = T_0 y', \tag{18-3}$$
the latter equation resulting from the definition of $y'$ as slope. Substitution
in (18-2) gives
$$\int_0^s w(s)\,ds + \int_0^x f(x)\,dx = T_0 y'. \tag{18-4}$$
When $w$ and $f$ are continuous, (18-4) may be differentiated with respect
to x, a procedure which leads to the differential equation for the curve
$$w(s)\,\frac{ds}{dx} + f(x) = T_0 y'' \tag{18-5}$$
in view of the fact that
$$\frac{d}{dx}\int_0^s w(s)\,ds = \left(\frac{d}{ds}\int_0^s w(s)\,ds\right)\frac{ds}{dx} = w(s)\,\frac{ds}{dx}.$$
Example. Show that a uniform flexible chain acted upon by gravitational forces alone
assumes the shape of a catenary, and find the tension in terms of the height y.
Here $f(x) = 0$, $w(s) = w_0$, a constant. Since $ds/dx = \sqrt{1 + y'^2}$, Eq. (18-5) gives
$$w_0\sqrt{1 + y'^2} = T_0 y''. \tag{18-6}$$
This is a second-order equation, which can be reduced to one of first order by the method
of Sec. 13. With $p = dy/dx$ we have
$$y'' = \frac{dp}{dx} = \frac{dp}{dy}\,\frac{dy}{dx} = p\,\frac{dp}{dy},$$
and hence (18-6) becomes
$$c\sqrt{1 + p^2} = p\,\frac{dp}{dy}, \qquad c = \frac{w_0}{T_0}. \tag{18-7}$$
This equation is separable, the solution being
$$cy = \sqrt{1 + p^2} \tag{18-8}$$
when the axis is so chosen that $cy = 1$ when $p = 0$. Equations (18-1) and (18-8) now
give $T = T_0\sec\theta = T_0\sqrt{1 + \tan^2\theta} = T_0\sqrt{1 + p^2} = T_0 c y = w_0 y$, which gives the
tension. Writing $p = dy/dx$ in (18-8) and separating variables yield
$$y = \frac{T_0}{w_0}\cosh\frac{w_0 x}{T_0},$$
as the reader will verify.
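The verification left to the reader can also be carried out numerically. The sketch below uses assumed values of $T_0$ and $w_0$ (in consistent units) and the closed-form derivatives of the hyperbolic cosine; it checks that the catenary satisfies (18-6) and that the tension equals $w_0 y$.

```python
import math

T0, w0 = 50.0, 2.0        # assumed tension at lowest point and weight density
c = w0 / T0               # the constant c of (18-7)

def y(x):
    """Catenary y = (T0/w0) cosh(w0 x / T0)."""
    return math.cosh(c * x) / c

def yp(x):
    return math.sinh(c * x)

def ypp(x):
    return c * math.cosh(c * x)

def residual_18_6(x):
    """w0*sqrt(1 + y'^2) - T0*y'', which should vanish by (18-6)."""
    return w0 * math.sqrt(1.0 + yp(x) ** 2) - T0 * ypp(x)

def tension(x):
    """T = T0 sec(theta) = T0*sqrt(1 + y'^2), from (18-1)."""
    return T0 * math.sqrt(1.0 + yp(x) ** 2)

points = [-20.0, 0.0, 5.0, 30.0]
print([residual_18_6(x) for x in points])
print([(tension(x), w0 * y(x)) for x in points])
```

At the lowest point $x = 0$ the height is $y = T_0/w_0$, so the tension there is $T_0$, as it should be.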
PROBLEMS
1. A flexible weightless cable supports a uniform roadway weighing $w_0$ lb per ft.
The tensions at the highest and lowest points are $T$ and $T_0$, the roadway is $2a$ ft long, the
sag is $b$, and the length of the cable is $2s$. If the cable is symmetric about the y axis
and has its lowest point at the origin, show that the equation of the curve is
$$y = \frac{w_0 x^2}{2T_0},$$
and thus obtain the relations $T_0 = w_0 a^2/2b$, $T = w_0 (a/b)\sqrt{a^2/4 + b^2}$,
$$s = \int_0^a \sqrt{1 + (2bx/a^2)^2}\,dx, \qquad s - a \sim 2b^2/3a$$
as $b \to 0$. Hint: $\sqrt{1 + u} - 1 \sim u/2$ as $u \to 0$.
2. One end of a flexible uniform telephone wire is $b$ ft above the lowest point, at a
distance $a$ ft from it measured horizontally and at a distance $s$ ft from it along the wire.
If $u = a w_0/T_0$, in the notation of Sec. 18, show that $u$ satisfies the transcendental
equations $(\cosh u - 1)/u = b/a$, $(\sinh u)/u = s/a$ and hence by division the
nontranscendental equation $\tanh(u/2) = b/s$. Also find the relations $T_0 = w_0 a/u = w_0 s\,\mathrm{csch}\,u$,
$T = w_0 a(\cosh u)/u = w_0 s\coth u$ for $T_0$ and for $T$, the tension at the highest point.
The student familiar with infinite series will obtain simplified expressions by expansion
of the hyperbolic functions when $u$ is small, that is, when the tension is large.
19. Newton’s Law of Motion. Newton’s second law of motion states that
the time rate of change of momentum is equal to the impressed force. In
symbols,
$$\frac{d}{dt}(mv) = F, \tag{19-1}$$
where F = component of force in the direction of motion
m = mass of the moving particle
v = ds/dt = velocity of the moving particle
It is supposed that the particle moves in a straight line, its distance from
some fixed point on that line being $s$.
The differential equation (19-1) is quite general, since the force may
depend on the time $t$, on the displacement $s$, and, in the case of damped
motion, on the velocity $v$. Also the mass may be variable in some problems,
for example, those concerned with rocket flight or with high-speed electrons.
Since
$$\frac{d(mv)}{dt} = \frac{d(mv)}{ds}\,\frac{ds}{dt} = v\,\frac{d(mv)}{ds}, \tag{19-2}$$
Eq. (19-1) may be put in the form
$$v\,\frac{d(mv)}{ds} = F, \tag{19-3}$$
which gives an alternative statement of Newton's law. Multiplying
(19-3) by $m$ leads to $(mv)\,d(mv)/ds = Fm$, which may be written
$$\frac{d}{ds}\,\frac{1}{2}(mv)^2 = Fm. \tag{19-4}$$
This gives still another formulation.
In case $m$ and $F$ are known in terms of $s$ only, $m = m(s)$, $F = F(s)$, then
(19-4) may be solved completely:
$$\tfrac12 (mv)^2 - \tfrac12 (m_0 v_0)^2 = \int_{s_0}^{s} F(s)\,m(s)\,ds. \tag{19-5}$$
If the mass is constant, (19-5) becomes
$$\tfrac12 mv^2 - \tfrac12 mv_0^2 = \int_{s_0}^{s} F(s)\,ds, \tag{19-6}$$
since $m(s)$ may be factored out of the integral. Then (19-6) is the law of
conservation of energy, for the left side of (19-6) is the change in kinetic
energy while the right side represents the work done when the particle
moves from $s_0$ to $s$. Thus the right side is the change in potential energy.
The steps leading to (19-6) are evidently reversible if $F(s)$ is continuous,
and hence Newton's law is equivalent to the principle of conservation of energy,
when the mass is constant and the force is a continuous function of position
only.
When $F$ and $m$ are known functions of $s$, it has been seen that one can
obtain a so-called first integral of the equations. If $F$ is a known function
of $t$, the same is true; we have
$$mv - m_0 v_0 = \int_{t_0}^{t} F(t)\,dt, \tag{19-7}$$
by inspection of (19-1). And similarly, when $F$ and $m$ are known functions
of $v$, one can write $d(mv) = m(v)\,dv + v\,m'(v)\,dv$. Substitution in (19-1) and
separation of variables now give
$$t - t_0 = \int_{v_0}^{v} \frac{m(v) + v\,m'(v)}{F(v)}\,dv. \tag{19-8}$$
The same process in (19-3) yields
$$s - s_0 = \int_{v_0}^{v} \frac{m(v) + v\,m'(v)}{F(v)}\,v\,dv. \tag{19-9}$$
For several particles addition gives
$$\frac{d}{dt}\sum m_i v_i = \sum F_i. \tag{19-10}$$
With $M$ as the total mass $M = \sum m_i$, and with $V$ as the mean velocity, $MV = \sum m_i v_i$,
this may be written $(d/dt)(MV) = F$. Here $F = \sum F_i$ is the total force; but since the
internal forces cancel in pairs, by Newton’s law of equal and opposite reaction, $F$ is also
the total external force acting on the system. The extension to continuous mass
distributions is made by analogy, the equations being defined as the limiting form of those
for a set of approximating discrete distributions. Thus any point moving with the mean
velocity $V$ satisfies Newton’s law in the form (19-1). It can be shown that this point
actually remains “inside” the body if the $v_i$ are suitably restricted, but some restriction
is necessary. Of course, if the masses are constant, then $V = dS/dt$, where $S$ is the
position of the center of mass, $MS = \sum m_i s_i$. In that case the center of mass itself follows
(19-1).
Example 1. The force on a particle of mass $m$ is proportional to its distance from the
origin and is directed toward the origin. Find a differential equation for its motion.
The force is $ks$ if $s$ is the distance from the origin at time $t$. Since the force is directed
toward the origin, it has at all times a sign opposite to that of $s$. Thus $k$ is negative,
and one may write $k = -m\omega^2$ for some constant $\omega$. Equation (19-1) will now give
$d(mv)/dt = -m\omega^2 s$ or, dividing by $m$ and putting $v = ds/dt$,
$$\frac{d^2 s}{dt^2} + \omega^2 s = 0. \tag{19-11}$$
This is the equation for simple harmonic motion, an important type of periodic motion
that arises in many mechanical and electrical systems. The general solution of (19-11) is
$$s = A\cos(\omega t + B) \tag{19-12}$$
as shown in Probs. 2 and 3 and in the Example of Sec. 21. Hence the motion is periodic,
with period $2\pi/\omega$ independent of the amplitude $A$ and phase $B$.
Example 2. A gun containing a bullet moves with nonnegative velocity $v$ on a straight,
horizontal, frictionless track and points in a direction exactly opposite to that of the
motion. The mass of bullet-plus-gun is $m$, and that of the bullet is $-\Delta m$, where $\Delta m$ is
negative. If the bullet is fired with velocity $c$ relative to the gun, show that $(v - c)\,\Delta m$
equals the momentum of the gun after firing minus the momentum of the bullet-plus-gun
before firing.
By (19-1) the momentum of the bullet-plus-gun is constant, since there is (we
assume) no external force on this system as a whole. Hence
$$mv = (m + \Delta m)(v + \Delta v) + (-\Delta m)\,v_b, \tag{19-13}$$
where $v + \Delta v$ is the new velocity of the gun and $v_b$ of the bullet:
$$v_b = v - c. \tag{19-14}$$
Computing $(m + \Delta m)(v + \Delta v) - mv$ from (19-13) and (19-14) gives the result.
Two remarks are in order. First, the “equation of continuity for momentum,” Eq.
(19-13), has been seen to follow from Newton’s law; it is not a new assumption. Second,
if one replaces (19-14) by
$$v_b = v + \Delta v - c \tag{19-15}$$
(which is a justifiable alternative), the result is altered only by the second-order term
$\Delta v\,\Delta m$. Hence there is no change at all when the increments are replaced by differentials
as in the following example.
Example 3. A rocket fires some of its mass backward at a constant rate $r$ kg per sec
and at a constant speed $c$ m per sec relative to the rocket. Show that the thrust
developed is $r(c - v)$ when the velocity of the rocket is $v$. If the rocket starts with velocity
$v_0$ and there is no other force acting, $v = v_0 + c\log 2$ when half the mass is used up.
With $m$ and $v$ the mass and velocity of the rocket at time $t$, the differential in
momentum $d(mv)$ due to external forces is $F\,dt$ by (19-1). That due to loss of a mass $-dm$ at
speed $c - v$ in the backward direction is $d(mv) = (c - v)(-dm)$ by Example 2; for
differentials, not increments, the result is exact. Thus is obtained a fundamental
relation for rocket problems:
$$d(mv) = F\,dt - (c - v)\,dm. \tag{19-16}$$
In the present case $dm = -r\,dt$, so that (19-16) gives
$$\frac{d(mv)}{dt} = F + r(c - v) \tag{19-17}$$
after division by $dt$. Hence the effect of the rocket motor is to add $r(c - v)$ to the force $F$,
and that is what was to be shown.
Substituting
$$m = m_0 - rt \tag{19-18}$$
for $m$ in (19-17) gives $(m_0 - rt)(dv/dt) = rc + F$ after slight simplification. Hence, by
separating variables,
$$v - v_0 = \left(c + \frac{F}{r}\right)\log\frac{m_0}{m} \qquad \text{for constant } F. \tag{19-19}$$
Putting $F = 0$, $m = m_0/2$ gives the second result.
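Formula (19-19) can be compared with a direct numerical integration of $(m_0 - rt)\,dv/dt = rc + F$. In the sketch below the rocket data are assumed values chosen for illustration, and Simpson's rule is used because the right-hand side depends on $t$ only.

```python
import math

def acceleration(t, c, r, m0, F):
    """dv/dt from (m0 - r*t) dv/dt = r*c + F."""
    return (r * c + F) / (m0 - r * t)

def velocity_numeric(v0, c, r, m0, F, t_end, n=2000):
    """v(t_end) by composite Simpson's rule (n must be even)."""
    h = t_end / n
    total = acceleration(0.0, c, r, m0, F) + acceleration(t_end, c, r, m0, F)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * acceleration(i * h, c, r, m0, F)
    return v0 + h * total / 3.0

def velocity_closed_form(v0, c, r, m0, F, t):
    """Equation (19-19) with m = m0 - r*t."""
    return v0 + (c + F / r) * math.log(m0 / (m0 - r * t))

# Assumed data: 1000 kg initial mass, 10 kg/sec burn rate, 2000 m/sec exhaust speed.
v0, c, r, m0 = 0.0, 2000.0, 10.0, 1000.0
t_half = m0 / (2.0 * r)                     # time at which half the mass is used up

v_free_num = velocity_numeric(v0, c, r, m0, 0.0, t_half)
v_free_formula = velocity_closed_form(v0, c, r, m0, 0.0, t_half)
v_drag_num = velocity_numeric(v0, c, r, m0, -500.0, t_half)
v_drag_formula = velocity_closed_form(v0, c, r, m0, -500.0, t_half)
print(v_free_num, v_free_formula, c * math.log(2.0))
```

With $F = 0$ both computations reproduce $v = v_0 + c\log 2$; a constant retarding force simply replaces $c$ by $c + F/r$.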
Example 4. Starting with velocity $v_0$ an electron is accelerated for a distance $s$ by a
constant electric field of magnitude $E$. What is the terminal velocity?
Let $c$ be the velocity of light, so that the mass $m$ of the electron is given in terms of
its rest mass $m_0$ and its velocity $v$ by
$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}}. \tag{19-20}$$
If we write
$$\frac{v}{c} = \sin\theta, \qquad \sqrt{1 - \frac{v^2}{c^2}} = \cos\theta, \tag{19-21}$$
as we may for $v < c$, then (19-20) gives $m = m_0\sec\theta$ and $mv = c\,m_0\tan\theta$. Substituting
in (19-3) with $F = Ee$, where $e$ is the charge on the electron, gives
$$\frac{d}{ds}(c\,m_0\tan\theta) = \frac{F}{v} = \frac{eE}{c}\csc\theta.$$
Hence $\sec^2\theta\,(d\theta/ds) = (eE/m_0c^2)\csc\theta$, and by integration
$$\sec\theta = \sec\theta_0 + \frac{seE}{m_0c^2}, \tag{19-22}$$
where $\theta_0$ refers to the initial value. For numerical calculation it is more efficient to use
the form (19-22) with trigonometric tables than to obtain $v$ explicitly by (19-21).
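With a computer in place of trigonometric tables, (19-21) and (19-22) translate into a few lines. The sketch below works in SI units (an assumption; the text does not fix a unit system for this example) with rounded standard values for the electron charge, rest mass, and speed of light, and the field strengths are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # electron charge magnitude, coulombs
M_REST = 9.1093837e-31        # electron rest mass, kg
C_LIGHT = 2.99792458e8        # speed of light, m/sec

def terminal_velocity(v0, s, field):
    """Apply (19-22): sec(theta) = sec(theta0) + s*e*E/(m0*c^2), then v = c*sin(theta)."""
    sec0 = 1.0 / math.sqrt(1.0 - (v0 / C_LIGHT) ** 2)
    sec1 = sec0 + s * E_CHARGE * field / (M_REST * C_LIGHT ** 2)
    return C_LIGHT * math.sqrt(1.0 - 1.0 / (sec1 * sec1))

# Weak field: the result should be close to the constant-mass value sqrt(2 e E s / m0).
v_weak = terminal_velocity(0.0, 1.0, 1.0)
v_newton = math.sqrt(2.0 * E_CHARGE * 1.0 * 1.0 / M_REST)

# Strong field (1e6 volts per meter over 1 meter): the velocity stays below c.
v_strong = terminal_velocity(0.0, 1.0, 1.0e6)
print(v_weak, v_newton, v_strong / C_LIGHT)
```

The quantity $\sec\theta$ is the ratio $m/m_0$, so (19-22) states that the mass increases linearly with the distance traveled in a constant field.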
PROBLEMS
1. A brick is set moving in a straight line over ice with an initial velocity of 20 fps.
If the coefficient of friction between the brick and the ice is 0.2, how long will it be before
the brick stops?
2. Find a value of $a$ for which $s = c_1\sin(at + c_2)$ is a solution of (19-11). Does this
expression have enough independent constants to be a general solution? Determine
$c_1$ and $c_2$ in such a way that the displacement $s$ is maximum at $t = 0$ and has then the
value $A$. Determine $c_1$ and $c_2$ in such a way that the maximum displacement is $A$ and
the maximum velocity occurs at $t = 0$.
3. Apply the transformations used in the derivation of (19-6) to obtain the
appropriate form of (19-6) from (19-11). Check by direct comparison with (19-6). Solve the
resulting equation by separating variables, and thus show that the solution obtained in
Prob. 2 includes every solution.
4. Suppose the rocket in Example 3 is subject to a retarding force of magnitude
$mg + kv$, where $g$ and $k$ are constant. From (19-17) and (19-18) obtain a linear equation
for $v$ as a function of $t$. Show that $(m_0 - rt)^{-k/r}$ is an integrating factor, solve for
$v$, and obtain the position $s$ at time $t$ from $s = \int v\,dt$.
5. The equation of a cycloid is $x = \theta + \sin\theta$, $y = 1 - \cos\theta$. Show that the arc $s$
from the lowest point satisfies $s^2 = 8y$, and deduce the equation $4v^2 = g(s_0^2 - s^2)$ for
a particle sliding down the curve. By differentiation obtain the equation $4\,d^2s/dt^2 = -gs$,
which shows that the motion is simple harmonic. What is the period?
6. In a microwave electron accelerator the field is $E\sin\omega t$. If an electron starts with
velocity $v_0$, find the maximum possible terminal velocity. Hint: The maximum occurs
when the time for passage is exactly $\pi/\omega$, for an electron starting at time $t = 0$. Use
(19-8).
20. Newton’s Law of Gravitation. Another law of Newton is the law of
gravitation, to which he was led in his attempt to explain the motion of
the planets. This law states that two bodies attract each other with a force
proportional to the product of their masses and inversely proportional to
the square of the distance between them, the distance being large compared
with the dimensions of the bodies. If the force of attraction is denoted
by $F$, the masses of the two bodies by $m_1$ and $m_2$, and the distance between
them by $r$, then
$$F = \frac{\gamma m_1 m_2}{r^2}, \tag{20-1}$$
where $\gamma$ is a proportionality constant, called the gravitational constant.
In the cgs system the value of $\gamma$ is $6.664 \times 10^{-8}$.
It can be established that a uniform spherical shell attracts a particle
at an external point as if the whole mass of the shell were collected at the
center (see Chap. 5, Sec. 14). Hence, by integration, the same is true for
a solid sphere provided the density is a function of the radius only. If the
sphere is the earth, one can therefore write
$$F = mg\left(\frac{r_e}{r}\right)^2 \quad \text{toward the earth}, \tag{20-2}$$
where $r_e$ = earth’s mean radius
$r$ = distance from particle to center of earth
$g$ = new constant called acceleration of gravity
Its value in the cgs system is approximately 980 cm per sec per sec and
in the fps system 32.2 ft per sec per sec. Since the earth is not a perfect
sphere, and since the density varies from place to place, the value of $g$
depends slightly on location. One uses a plus or a minus sign in (20-2)
according as the positive direction is taken toward or away from the earth’s
center.
It can be shown that a uniform spherical shell exerts no force on a particle which is in
the hollow space enclosed by the shell (Chap. 5, Sec. 14). Hence the force on a particle
of mass $m$ at distance $r$ from the center of a sphere is
$$F = \frac{\gamma m}{r^2}\int_0^r 4\pi u^2 \rho(u)\,du \tag{20-3}$$
when the density of the material forming the sphere is a continuous function $\rho(u)$ of the
distance $u$ to the center. The special case $\rho(u) = 0$ for $u > r_0$ gives the result for a
particle outside the sphere, as discussed previously.
Equation (20-3) gives
$$F = mg\,\frac{r}{r_e} \tag{20-4}$$
for a particle inside the earth if $\rho$ is taken as constant. In case the particle is close to
the surface, we have $r \approx r_e$, so that either (20-2) or (20-4) takes the simple form
$$F = mg. \tag{20-5}$$
The error in (20-5) is less than 1 per cent for heights up to about 20 miles.
Example 1. Neglecting air resistance, discuss the velocity of a particle falling toward
the earth.
The principle of conservation of energy combines with (20-2) to give
$$\tfrac12 mv^2 - \tfrac12 mv_0^2 = -\int_{r_0}^{r} mg\,\frac{r_e^2}{r^2}\,dr$$
or, after carrying out the integration,
$$v^2 = v_0^2 + 2gr_e^2\left(\frac{1}{r} - \frac{1}{r_0}\right). \tag{20-6}$$
If the particle starts from rest at a very great distance, the velocity with which it strikes
the earth is
$$v_e = \sqrt{2gr_e} \tag{20-7}$$
as we see by setting $v_0 = 0$, $r_0 = \infty$, $r = r_e$ in (20-6). This terminal velocity is also the
minimum velocity of escape for a particle which leaves the earth never to return. Since
$r_e$ is approximately 4,000 miles, we find from (20-7) that $v_e$ is nearly 7 miles per sec.
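The arithmetic behind the figure of nearly 7 miles per sec is short enough to spell out. The sketch below evaluates (20-7) with the values quoted in the text, $g = 32.2$ ft per sec per sec and $r_e \approx 4{,}000$ miles; the only added ingredient is the conversion factor of 5,280 ft per mile.

```python
import math

FT_PER_MILE = 5280.0

def escape_velocity_miles_per_sec(g_ft_per_sec2=32.2, r_e_miles=4000.0):
    """Evaluate (20-7), v_e = sqrt(2 g r_e), working in miles and seconds."""
    g_miles = g_ft_per_sec2 / FT_PER_MILE        # g in miles per sec per sec
    return math.sqrt(2.0 * g_miles * r_e_miles)

v_e = escape_velocity_miles_per_sec()
print(f"escape velocity is about {v_e:.2f} miles per sec")
```

Because $v_e$ varies as the square root of $r_e$, the rounded value of the earth's radius affects only the later digits of the result.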
Example 2. Obtain a differential equation relating the density and pressure in the
interior of a spherical star if each is a function of the distance to the center only: $\rho = \rho(r)$,
$p = p(r)$. Assume $\rho(r)$ continuous.
Consider a column of material of unit cross section extending along a radius from
$r$ to $r_s$, the radius of the star. The pressure at the base of this column equals the total
downward force on the column due to gravitation. The differential force on an element
of the column from $r = q$ to $r = q + dq$ is given by
$$dF = \gamma\,\frac{\rho(q)\,dq}{q^2}\int_0^q 4\pi u^2 \rho(u)\,du \equiv \phi(q)\,dq,$$
in accordance with (20-3), and the total force is given by integration:
$$p(r) = F = \int_r^{r_s} \phi(q)\,dq = -\int_{r_s}^{r} \phi(q)\,dq. \tag{20-8}$$
Since the right side of Eq. (20-8) has a continuous derivative, so does $p(r)$:
$$\frac{dp}{dr} = -\gamma\,\frac{\rho(r)}{r^2}\int_0^r 4\pi u^2 \rho(u)\,du.$$
Multiplying by $r^2/\rho(r)$ and differentiating again lead to
$$\frac{d}{dr}\left(\frac{r^2}{\rho}\,\frac{dp}{dr}\right) + 4\pi\gamma r^2 \rho = 0.$$
Example 3. Assuming conservation of energy for motion in a curved path, obtain an
expression giving the period of a simple pendulum.
Fig. 14
Let $P$ denote the position of a pendulum bob suspended from $O$, and let $\theta$ be the
angle made by $OP$ with the position of equilibrium $OQ$, as shown in Fig. 14. The work
required to change $\theta$ to any other value $\alpha$ is the work required to raise the bob through a
vertical distance $a\cos\theta - a\cos\alpha$, if $a$ is the pendulum length. With $\alpha$ chosen as the
angle for maximum displacement, so that $v = 0$ at $\theta = \alpha$, conservation of energy gives
$$\frac12\,m\left(a\,\frac{d\theta}{dt}\right)^2 = mga(\cos\theta - \cos\alpha),$$
where $m$ is the mass of the bob. Separating variables gives $t$ in terms of $\theta$ and hence,
implicitly, $\theta$ in terms of $t$. In particular the time required for $\theta$ to increase from 0 to $\alpha$ is
$$\frac{T}{4} = \sqrt{\frac{a}{2g}}\int_0^{\alpha} \frac{d\theta}{\sqrt{\cos\theta - \cos\alpha}}, \tag{20-9}$$
so that the period $T$ depends on the amplitude $\alpha$. This dependence leads to the so-called
circular error in pendulum clocks.
The identities $\cos\theta = 1 - 2\sin^2(\theta/2)$, $\cos\alpha = 1 - 2\sin^2(\alpha/2)$ give
$$T = 2\sqrt{\frac{a}{g}}\int_0^{\alpha} \frac{d\theta}{\sqrt{k^2 - \sin^2(\theta/2)}}, \qquad k = \sin\frac{\alpha}{2}. \tag{20-10}$$
If a new variable of integration $\phi$ is defined by
$$\sin\frac{\theta}{2} = k\sin\phi, \tag{20-11}$$
then $\phi$ ranges from 0 to $\pi/2$ when $\theta$ ranges from 0 to $\alpha$. Also by (20-11)
$$d\theta = \frac{2k\cos\phi\,d\phi}{\cos(\theta/2)} = \frac{2\sqrt{k^2 - \sin^2(\theta/2)}}{\sqrt{1 - k^2\sin^2\phi}}\,d\phi.$$
Substitution into (20-10) yields a so-called elliptic integral
$$T = 4\sqrt{\frac{a}{g}}\int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2\sin^2\phi}}, \qquad k = \sin\frac{\alpha}{2}. \tag{20-12}$$
The advantage of (20-12) over (20-9) is that (20-12) has been extensively studied and is
available in tables. A series expansion is easily obtained, by expanding the radical
for small $\alpha$. The result is
$$T = 2\pi\sqrt{\frac{a}{g}}\left[1 + \left(\frac12\right)^2 k^2 + \left(\frac{1\cdot 3}{2\cdot 4}\right)^2 k^4 + \left(\frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\right)^2 k^6 + \cdots\right].$$
The function
$$F(k,x) = \int_0^x \frac{d\phi}{\sqrt{1 - k^2\sin^2\phi}}$$
is called the elliptic integral of the first kind. See Chap. 2, Sec. 10.
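In place of tables, the integral (20-12) can be evaluated by simple quadrature and compared with the first few terms of the series. The sketch below assumes a pendulum length of 1 ft with $g = 32.2$ ft per sec per sec and an amplitude of 0.5 radian; the midpoint rule and the number of series terms are likewise choices made only for illustration.

```python
import math

def period_quadrature(a, g, alpha, n=2000):
    """Evaluate (20-12) with the composite midpoint rule on [0, pi/2]."""
    k = math.sin(alpha / 2.0)
    h = (math.pi / 2.0) / n
    total = sum(
        1.0 / math.sqrt(1.0 - (k * math.sin((i + 0.5) * h)) ** 2)
        for i in range(n)
    )
    return 4.0 * math.sqrt(a / g) * h * total

def period_series(a, g, alpha, terms=4):
    """Partial sum 1 + (1/2)^2 k^2 + (1*3/(2*4))^2 k^4 + ... of the series."""
    k = math.sin(alpha / 2.0)
    coeff, s = 1.0, 1.0
    for j in range(1, terms):
        coeff *= (2 * j - 1) / (2 * j)      # builds 1/2, (1*3)/(2*4), ...
        s += coeff ** 2 * k ** (2 * j)
    return 2.0 * math.pi * math.sqrt(a / g) * s

a, g, alpha = 1.0, 32.2, 0.5                # assumed length, gravity, amplitude
T_quad = period_quadrature(a, g, alpha)
T_series = period_series(a, g, alpha)
T_small = 2.0 * math.pi * math.sqrt(a / g)  # small-amplitude limit
print(T_quad, T_series, T_small)
```

For this amplitude the period exceeds the small-amplitude value $2\pi\sqrt{a/g}$ by about 1.6 per cent, which is the circular error mentioned above.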
PROBLEMS
1. A stone is thrown vertically upward with velocity 8 fps at time $t = 0$. Using
(20-5), write an expression for the position and velocity at time $t$ and also for the velocity
as a function of distance $s$. Find the time at which the velocity is zero, and show that
the height is then maximum. Show that the maximum height agrees with that obtained
by equating kinetic and potential energy, that is, with $mgh = mv_0^2/2$.
2. A particle slides down an inclined plane, making an angle $\theta$ with the horizontal.
If the initial velocity is zero and friction may be neglected, the component of force in
the direction of motion is $F = mg\sin\theta$. What are the velocity of the particle and the
distance traveled during the time $t$? Find the speed as a function of the vertical distance
fallen, and verify that the same result would be given by equating energies as in Prob. 1.
At any given instant, show that the locus of the particles obtained for various $\theta$'s is a
circle (Fig. 15).
3. Suppose the pressure in the atmosphere is a known function of the density, $p = f(\rho)$.
Show that the height $h$ at which the density has first dropped to zero from the sea-level
value $\rho_0$ satisfies $gh/(1 + h/r_0) = \int_0^{\rho_0} [f'(\rho)/\rho]\,d\rho$, when (20-2) is used. Hence the
dependence of gravitation on distance introduces an effect which depends on $h$ only, not
on $f(\rho)$, and the effect is less than $0.02h$ when $h$ is less than 80 miles. Obtain an explicit
expression for $h$ in the case of adiabatic expansion, $p = k\rho^a$. For what values of $a$ does
this give a finite height? (For air $a = 1.5$; the height thus obtained turns out to be about
18 miles.)
4. A man and a parachute weighing $w$ lb fall from rest under the force of gravity.
If the air resistance is proportional to the square of the speed $v$, and if the limiting speed
is $v_0$, find the speed as a function of the time $t$ and as a function of the distance fallen $s$.
Hint: $(w/g)(dv/dt) = w - kv^2$.
5. A projectile is fired, with an initial velocity $v_0$, at an angle $\alpha$ with the horizontal.
Find the equation of the path under the assumption that the force of gravity is the only
force acting on the projectile. For what $\alpha$ is the range maximum? Describe the region
which is within the range of the gun. Hint: Find the envelope of the trajectories when
$v_0$ is fixed but $\alpha$ varies.
6. A cylindrical tumbler containing liquid is rotated with a constant angular velocity
about the axis of the tumbler. Show that the surface of the liquid assumes the shape of
a paraboloid of revolution. Hint: The resultant force acting on a particle of the liquid is
directed normally to the surface. This resultant is compounded of the force of gravity
and the centrifugal force, since pressure at the free boundary is zero.
7. Water is flowing out through a circular hole in the side of a cylindrical tank 2 ft
in diameter. The velocity of the water in the jet has the value which it would attain
by a free fall through a distance equal to the head. How long will it take the water to
fall from a height of 25 ft to a height of 9 ft above the orifice if the stream of water is
1 in. in diameter?
8. Water is flowing out from a 2-in. horizontal pipe running full. Find the discharge
in cubic feet per second if the jet of water strikes the ground 4 ft beyond the end of the
pipe when the pipe is 2 ft above the ground.
LINEAR DIFFERENTIAL EQUATIONS
21. Linear Homogeneous Second-order Equations. An equation of the
form
$$y'' + p_1(x)\,y' + p_2(x)\,y = 0, \tag{21-1}$$
in which $p_1(x)$ and $p_2(x)$ are specified continuous functions of $x$ in a given
interval $(a,b)$, is called a linear homogeneous equation of second order.
From the existence and uniqueness theorem of Sec. 1 it follows that, for
every $x = x_0$ in $(a,b)$, this equation has a unique solution satisfying the
initial conditions $y(x_0) = y_0$, $y'(x_0) = y_0'$. Thus, the integral curve for
Eq. (21-1) is determined uniquely when the ordinate and the slope of the
curve are specified at a given point of the interval.
Equation (21-1) is called linear because its solutions satisfy the following
linearity properties:
1. If $y = y_1(x)$ is a solution of (21-1), then $y = c\,y_1(x)$, where $c$ is a
constant, is also a solution.
2. If $y = y_1(x)$ and $y = y_2(x)$ are two solutions of (21-1), then their
sum $y = y_1(x) + y_2(x)$ is also a solution.
It follows from these properties that the sum of any number of solutions
of (21-1) each multiplied by a constant is also a solution.
The proof that properties 1 and 2 hold is simple.
Thus, suppose that $y = y_1(x)$ is a solution of (21-1); then the substitution
in (21-1) gives an identity
$$y_1'' + p_1 y_1' + p_2 y_1 \equiv 0. \tag{21-2}$$
We must show that
$$(c y_1)'' + p_1 (c y_1)' + p_2 (c y_1) \tag{21-3}$$
vanishes identically for every constant $c$. But since $c$ can be taken outside
the differentiation sign, we can write (21-3) as
$$c\,(y_1'' + p_1 y_1' + p_2 y_1),$$
and this vanishes because (21-2) does.
This establishes property 1.
To establish property 2, suppose that $y = y_1(x)$ and $y = y_2(x)$ are two
solutions of (21-1). Then
$$\begin{aligned} y_1'' + p_1 y_1' + p_2 y_1 &\equiv 0, \\ y_2'' + p_1 y_2' + p_2 y_2 &\equiv 0. \end{aligned} \tag{21-4}$$
We must show that
$$(y_1 + y_2)'' + p_1 (y_1 + y_2)' + p_2 (y_1 + y_2) \equiv 0. \tag{21-5}$$
Inasmuch as the derivative of the sum of two functions is the sum of the
derivatives, we can rewrite the left-hand member in (21-5) as
$$(y_1'' + p_1 y_1' + p_2 y_1) + (y_2'' + p_1 y_2' + p_2 y_2),$$
and this vanishes by (21-4).
Let us suppose now that by some means we have obtained two solutions
$y_1(x)$ and $y_2(x)$ of (21-1). Then by the foregoing
$$y(x) = c_1 y_1(x) + c_2 y_2(x) \tag{21-6}$$
is a solution for every choice of the constants $c_i$. We say that (21-6) is
a general solution of (21-1) provided that for a suitable choice of the
constants $c_i$ the solution satisfies arbitrarily specified initial conditions,
$$y(x_0) = y_0, \qquad y'(x_0) = y_0'. \tag{21-7}$$
To determine the restrictions on $y_1(x)$ and $y_2(x)$ ensuring that the solution
(21-6) is, indeed, general, we insert (21-6) in (21-7) and obtain two linear
algebraic equations,
$$\begin{aligned} c_1 y_1(x_0) + c_2 y_2(x_0) &= y_0, \\ c_1 y_1'(x_0) + c_2 y_2'(x_0) &= y_0', \end{aligned} \tag{21-8}$$
for $c_1$ and $c_2$. The system (21-8) can be solved for $c_1$ and $c_2$ (for arbitrarily
specified $x_0$, $y_0$, and $y_0'$) if, and only if, the determinant
$$W(y_1,y_2) \equiv \begin{vmatrix} y_1(x) & y_2(x) \\ y_1'(x) & y_2'(x) \end{vmatrix} \neq 0 \tag{21-9}$$
for every $x = x_0$ in the interval. If $W(y_1,y_2) = 0$ for some value of $x$,
the constants $c_i$ cannot be determined for every choice of $y_0$ and $y_0'$ and the
solution (21-6) is not general. The determinant $W(y_1,y_2)$ is called the
Wronskian after the Polish mathematician G. Wronski, who deduced
the criterion (21-9) for the generality of solution (21-6).
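The solvability condition above can be sketched numerically. The following is a minimal illustration, not part of the original text: the function names `wronskian` and `fit_constants` are hypothetical, the caller is assumed to supply each solution together with its derivative, and the pair $\sin x$, $\cos x$ (solutions of $y'' + y = 0$) is used only as a test case. The constants are obtained from the system (21-8) by Cramer's rule.

```python
import math

def wronskian(y1, dy1, y2, dy2, x):
    """Return W(y1, y2) = y1*y2' - y2*y1' evaluated at x."""
    return y1(x) * dy2(x) - y2(x) * dy1(x)

def fit_constants(y1, dy1, y2, dy2, x0, y0, dy0):
    """Solve the 2x2 system (21-8) for (c1, c2) by Cramer's rule."""
    w = wronskian(y1, dy1, y2, dy2, x0)
    if abs(w) < 1e-12:
        # The determinant (21-9) vanishes, so (21-6) is not general.
        raise ValueError("Wronskian vanishes at x0")
    c1 = (y0 * dy2(x0) - y2(x0) * dy0) / w
    c2 = (y1(x0) * dy0 - dy1(x0) * y0) / w
    return c1, c2

# Test pair: y1 = sin x, y2 = cos x, with y(0) = 1 and y'(0) = 1.
neg_sin = lambda x: -math.sin(x)
c1, c2 = fit_constants(math.sin, math.cos, math.cos, neg_sin, 0.0, 1.0, 1.0)
w_sample = wronskian(math.sin, math.cos, math.cos, neg_sin, 0.7)
```

Here `w_sample` evaluates the Wronskian at an arbitrary point; for this pair it equals $-1$ at every $x$, so the constants can be found for any choice of initial data.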
The condition (21-9) is equivalent to the statement that the solutions
$y_1(x)$ and $y_2(x)$ are linearly independent. We say that $y_1(x)$ and $y_2(x)$
are linearly independent if the identity
$$c_1 y_1(x) + c_2 y_2(x) \equiv 0 \tag{21-10}$$
can be satisfied only by choosing $c_1 = c_2 = 0$. When constants
$c_1$ and $c_2$, not both zero, can be found such that $c_1 y_1(x) + c_2 y_2(x) \equiv 0$, we say that $y_1(x)$
and $y_2(x)$ are linearly dependent. In other words, linear independence of
$y_1(x)$ and $y_2(x)$ means that the ratio $y_2(x)/y_1(x)$ is not a constant. But if
this ratio is not a constant, its derivative
$$\frac{y_1 y_2' - y_2 y_1'}{y_1^2} \tag{21-11}$$
is not identically zero. We note that the numerator in (21-11) is precisely
$W(y_1,y_2) = y_1 y_2' - y_2 y_1'$. We have shown that if the solutions $y_1$ and $y_2$
are linearly independent, then $W(y_1,y_2) \neq 0$ for some value of $x$. Conversely,
if $W(y_1,y_2) = 0$ for some $x = x_0$, so that
$$\begin{vmatrix} y_1(x_0) & y_2(x_0) \\ y_1'(x_0) & y_2'(x_0) \end{vmatrix} = 0,$$
we can show that the solutions $y_1(x)$ and $y_2(x)$ are linearly dependent, for
we can choose constants $c_1$ and $c_2$, not both zero, in (21-6) so that at the given
point $x = x_0$ our solution satisfies the initial conditions
$$y(x_0) = 0, \qquad y'(x_0) = 0. \tag{21-12}$$
But if $y(x)$ in (21-6) satisfies these conditions, then $y(x) \equiv 0$ because there
exists only one solution of Eq. (21-1) satisfying initial conditions (21-12)
and the solution $y(x) \equiv 0$ obviously satisfies these conditions. We have
thus shown that constants $c_1$ and $c_2$, not both zero, can be found such that
$c_1 y_1(x) + c_2 y_2(x) \equiv 0$ for all values of $x$, and hence the solutions $y_1(x)$
and $y_2(x)$ are linearly dependent.¹
It follows from this that the problem of finding the general solution of
(21-1) reduces to the search for some pair of linearly independent solutions
$y_1(x)$, $y_2(x)$. It should be remarked that no formula is available for the
determination of solutions of the general second-order linear equation.
In the special instance when the coefficients $p_1$ and $p_2$ in (21-1) are
constants, the general solution, as we shall see in the following section, is
deduced easily.
Example: Verify that
$$y = c_1 \sin x + c_2 \cos x \tag{21-13}$$
is the general solution of
$$y'' + y = 0 \tag{21-14}$$
and determine the particular solution such that
$$y(0) = 1, \qquad y'(0) = 1. \tag{21-15}$$
The fact that $y_1 = \sin x$ and $y_2 = \cos x$ are, indeed, solutions of (21-14) is easily verified
by substituting $y = \sin x$ and $y = \cos x$ in (21-14). Hence their linear combination
(21-13) is a general solution provided that the determinant (21-9) does not vanish. In
our case,
$$W(y_1,y_2) = \begin{vmatrix} \sin x & \cos x \\ \cos x & -\sin x \end{vmatrix} = -1,$$
and thus (21-13) is the general solution. To determine the constants $c_i$ such that the
solution satisfies conditions (21-15), we form the set of Eqs. (21-8),
$$c_1 \sin 0 + c_2 \cos 0 = 1, \qquad c_1 \cos 0 - c_2 \sin 0 = 1,$$
from which it follows that $c_1 = c_2 = 1$. Thus the desired solution is $y = \sin x + \cos x$.
¹ See in this connection Prob. 6, Sec. 21.
PROBLEMS
1. Verify that $y = e^x$ and $y = e^{-x}$ are linearly independent solutions of $y'' - y = 0$.
Also show that $y_1 = \sinh x$ and $y_2 = \cosh x$ are a pair of linearly independent solutions
of this equation.
2. Show that $y = c_1 e^x + c_2 e^{2x}$ is the general solution of $y'' - 3y' + 2y = 0$, and find
the solution satisfying the conditions $y(0) = 0$, $y'(0) = 1$. What is the solution satisfying
$y(0) = y'(0) = 0$?
3. Show that $y_1 = 1/x$ and $y_2 = x^5$ are linearly independent solutions of
$x^2 y'' - 3xy' - 5y = 0$ if $x \neq 0$.
4. Verify that $y = c_1 x^2 + c_2 x$ is the general solution of $x^2 y'' - 2xy' + 2y = 0$ if
$x \neq 0$, and find the solution such that $y(1) = 2$ and $y'(1) = 0$.
5. Show that $y = c_1 e^{2x} + c_2 x e^{2x}$ is the general solution of $y'' - 4y' + 4y = 0$, and
find the solution for which $y(0) = 1$, $y'(0) = 4$. Also find the solution such that
$y(0) = y'(0) = 0$.
6. Compute the derivative of $W(y_1,y_2) = y_1 y_2' - y_2 y_1'$, where $y_1(x)$ and $y_2(x)$ are two
solutions of (21-1). Show that $y_1 y_2'' - y_2 y_1'' + p_1(x)(y_1 y_2' - y_2 y_1') = 0$ and
$dW/dx + p_1(x)W = 0$. Thus $W(y_1,y_2) = W_0 \exp\left(-\int_{x_0}^{x} p_1(x)\,dx\right)$, where $W_0$ is the value of $W(y_1,y_2)$ at
$x = x_0$. Conclude from this that if $W(y_1,y_2)$ does not vanish at $x = x_0$, it does not
vanish for any value of $x$. This result is known as Abel's theorem.
22. Homogeneous Second-order Linear Equations with Constant Coefficients.
Consider the equation
$$y'' + p_1 y' + p_2 y = 0 \tag{22-1}$$
with constant coefficients $p_1$, $p_2$. If we substitute
$$y = e^{mx} \tag{22-2}$$
in (22-1) and note that $y' = m e^{mx}$, $y'' = m^2 e^{mx}$, we obtain the equation
$$(m^2 + p_1 m + p_2)\,e^{mx} = 0,$$
or
$$m^2 + p_1 m + p_2 = 0, \tag{22-3}$$
since $e^{mx} \neq 0$. Thus, if $m$ in (22-2) is chosen as a root of the characteristic
equation (22-3), then (22-2) will be a solution of the given equation. The
roots of the quadratic equation (22-3) are
$$m = \frac{-p_1 \pm \sqrt{p_1^2 - 4p_2}}{2}. \tag{22-4}$$
If $p_1^2 - 4p_2 > 0$, there will be two distinct real roots, $m = m_1$ and $m = m_2$.
In this event,
$$y = e^{m_1 x}, \qquad y = e^{m_2 x}$$
are a pair of linearly independent solutions of Eq. (22-1), since their ratio
$$\frac{e^{m_1 x}}{e^{m_2 x}} = e^{(m_1 - m_2)x}$$
is not a constant when $m_1 \neq m_2$. Hence if $m_1 \neq m_2$, the general solution
of (22-1) is
$$y = c_1 e^{m_1 x} + c_2 e^{m_2 x}. \tag{22-5}$$
If $p_1^2 - 4p_2 < 0$, the roots (22-4) are conjugate complex numbers,
$$m_1 = a + bi, \qquad m_2 = a - bi,$$
and the complex functions
$$y_1 = e^{(a+bi)x}, \qquad y_2 = e^{(a-bi)x} \tag{22-6}$$
are linearly independent solutions of (22-1). We can write (22-6) in a
trigonometric form with the aid of Euler's formula [cf. Eq. (17-3), Chap. 2]
$$e^{(a \pm bi)x} = e^{ax}(\cos bx \pm i \sin bx),$$
so that
$$y_1 = e^{ax}(\cos bx + i \sin bx), \qquad y_2 = e^{ax}(\cos bx - i \sin bx) \tag{22-7}$$
are the complex solutions of (22-1).
We show next that when Eq. (22-1) with real coefficients $p_1$ and $p_2$ has
a complex solution of the form $y = u + iv$, then the real functions $u$ and
$v$ are solutions of this equation. Indeed, the substitution of $y = u + iv$ in
(22-1) yields on rearrangement
$$(u'' + p_1 u' + p_2 u) + i\,(v'' + p_1 v' + p_2 v) = 0,$$
and this can vanish if, and only if,
$$u'' + p_1 u' + p_2 u = 0, \qquad v'' + p_1 v' + p_2 v = 0.$$
Thus $y = u$ and $y = v$ satisfy (22-1).
Referring to (22-7) we see that corresponding to a pair of complex roots
$m = a \pm bi$ of the characteristic equation, we have a pair of linearly
independent real solutions
$$y_1 = e^{ax} \cos bx, \qquad y_2 = e^{ax} \sin bx. \tag{22-8}$$
It remains to consider the case when $p_1^2 - 4p_2 = 0$. In this event the
characteristic equation (22-3) has a double root
$$m_1 = m_2 = -\frac{p_1}{2},$$
and the foregoing method yields just one distinct solution $y_1 = e^{mx}$, with
$m = -p_1/2$. We can verify by direct substitution that another solution
is $y_2 = x e^{mx}$, which is obviously linearly independent, since
$y_2/y_1 = x \neq \text{const}$. Thus, when the characteristic equation has a double root
$m = -p_1/2$, the general solution of (22-1) is
$$y = (c_1 + c_2 x)\,e^{mx}. \tag{22-9}$$
In the following section we deduce this solution with the aid of the useful
notion of differential operators, which will be of help in resolving the cor-
responding situation involving multiple roots of linear equations with
constant coefficients of order higher than 2.
We illustrate the foregoing discussion by examples.
Example 1. Find two linearly independent solutions of $y'' + 3y' + 2y = 0$, and thus
obtain the general solution. Referring to (22-3), we see that the characteristic equation
in this case is
$$m^2 + 3m + 2 = 0, \tag{22-10}$$
which, on factoring, yields
$$(m + 1)(m + 2) = 0.$$
Thus, the roots of (22-10) are $m_1 = -1$, $m_2 = -2$, and hence the general solution is
$$y = c_1 e^{-x} + c_2 e^{-2x}.$$
Example 2. Solve $y'' + 2y' + 5y = 0$. The characteristic equation is
$$m^2 + 2m + 5 = 0,$$
and hence
$$m = \frac{-2 \pm \sqrt{4 - 20}}{2} = -1 \pm 2i.$$
Accordingly, the complex solutions are
$$y_1 = e^{(-1+2i)x}, \qquad y_2 = e^{(-1-2i)x}, \tag{22-11}$$
and by (22-8) the linearly independent real solutions are
$$y_1 = e^{-x} \cos 2x, \qquad y_2 = e^{-x} \sin 2x. \tag{22-12}$$
It should be remarked that for many purposes the complex form of solutions (22-11)
is just as useful as the real form (22-12).
Example 3. Solve $y'' + 2y' + y = 0$. The characteristic equation
$$m^2 + 2m + 1 = 0$$
has a double root $m = -1$. Accordingly, a pair of linearly independent solutions of the
given equation is $y_1 = e^{-x}$, $y_2 = x e^{-x}$. By (22-9) the general solution is
$$y = (c_1 + c_2 x)\,e^{-x}.$$
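The three cases of this section (distinct real roots, conjugate complex roots, double root) can be collected in one short routine. This is a sketch under stated assumptions: the name `homogeneous_basis` and the tolerance `tol` used to decide when the discriminant counts as zero are illustrative choices, not part of the text, and the coefficients are assumed real.

```python
import math

def homogeneous_basis(p1, p2, tol=1e-12):
    """Return (case, y1, y2): a real basis for y'' + p1*y' + p2*y = 0."""
    disc = p1 * p1 - 4.0 * p2          # discriminant of (22-3)
    if disc > tol:                     # two distinct real roots, (22-5)
        r = math.sqrt(disc)
        m1, m2 = (-p1 + r) / 2.0, (-p1 - r) / 2.0
        return ("distinct real",
                lambda x: math.exp(m1 * x),
                lambda x: math.exp(m2 * x))
    if disc < -tol:                    # roots a +/- bi, real form (22-8)
        a, b = -p1 / 2.0, math.sqrt(-disc) / 2.0
        return ("complex",
                lambda x: math.exp(a * x) * math.cos(b * x),
                lambda x: math.exp(a * x) * math.sin(b * x))
    m = -p1 / 2.0                      # double root, (22-9)
    return ("double",
            lambda x: math.exp(m * x),
            lambda x: x * math.exp(m * x))

case1, u1, u2 = homogeneous_basis(3, 2)   # Example 1
case2, v1, v2 = homogeneous_basis(2, 5)   # Example 2
case3, w1, w2 = homogeneous_basis(2, 1)   # Example 3
```

The general solution in each case is $c_1 y_1 + c_2 y_2$ with the two returned functions.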
PROBLEMS
Find the general solutions of:
1. $y'' + 3y' - 54y = 0$;
2. $y'' - 5y' + 6y = 0$;
3. $y'' - 2y' + y = 0$;
4. $y'' - 4y = 0$;
5. $y'' + 4y = 0$;
6. $y'' - 4y' + 4y = 0$;
7. $y'' - 4y' + 5y = 0$.
23. Differential Operators. We introduce a new notation for the derivative
symbol and write $D$ for $d/dx$ and, more generally, $D^n \equiv d^n/dx^n$.
Thus, $D \sin x$ means $(d \sin x)/dx = \cos x$, and $D^2 \sin x = (d^2 \sin x)/dx^2
= (d/dx)[(d \sin x)/dx] = -\sin x$. Since
$$\frac{d\,[c\,u(x)]}{dx} = c\,\frac{du}{dx} \qquad \text{and} \qquad \frac{d(u + v)}{dx} = \frac{du}{dx} + \frac{dv}{dx},$$
we see that
$$D\,cu(x) = c\,Du \qquad \text{and} \qquad D(u + v) = Du + Dv. \tag{23-1}$$
Moreover, since
$$\frac{d^n}{dx^n}\,\frac{dy}{dx} = \frac{d^{n+1}y}{dx^{n+1}},$$
we can write
$$D^n(Dy) = D^{n+1}y. \tag{23-2}$$
We agree also that
$$(D + m)y \equiv Dy + my.$$
If the symbol $(D + m_1)(D + m_2)y$, where $m_1$ and $m_2$ are constants, is
interpreted to mean that $(D + m_1)$ operates on $(D + m_2)y \equiv (dy/dx) + m_2 y$,
we find that
$$(D + m_1)(D + m_2)y = [D^2 + (m_1 + m_2)D + m_1 m_2]\,y. \tag{23-3}$$
From the structure of the right-hand member of (23-3) it follows that
$$(D + m_1)(D + m_2)y = (D + m_2)(D + m_1)y. \tag{23-4}$$
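Because the factors in (23-3) have constant coefficients, multiplying operators amounts to multiplying polynomials in $D$. A minimal sketch, assuming each operator is stored as a list of coefficients from the constant term upward (the name `multiply_operators` and this storage convention are illustrative, not from the text):

```python
def multiply_operators(p, q):
    """Multiply two constant-coefficient operator polynomials in D.

    Coefficients run from the constant term upward, so [m, 1] is D + m.
    Valid only for constant coefficients, as in (23-3).
    """
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

m1, m2 = 2, 3
left = multiply_operators([m1, 1], [m2, 1])    # (D + m1)(D + m2)
right = multiply_operators([m2, 1], [m1, 1])   # (D + m2)(D + m1)
```

Both products give the coefficient list of $D^2 + (m_1 + m_2)D + m_1 m_2$, which is the commutativity stated in (23-4). The same shortcut would not be valid for factors whose coefficients depend on $x$.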
Making use of these properties, we can write Eq. (22-1), namely,
$$\frac{d^2 y}{dx^2} + p_1 \frac{dy}{dx} + p_2 y = 0, \tag{23-5}$$
as
$$(D^2 + p_1 D + p_2)\,y = 0, \tag{23-6}$$
in which the differential operator
$$D^2 + p_1 D + p_2$$
behaves as though it were an algebraic polynomial.
We observe that this polynomial is identical with the polynomial in the
characteristic equation (22-3). On noting (23-3), we see that (23-6) can
be written in factored form as
$$(D - m_1)(D - m_2)\,y = 0, \tag{23-7}$$
where $m_1$ and $m_2$ are the roots of (22-3). Now, if $m_1 \neq m_2$, the general
solution of (23-7), as shown in the preceding section, is
$$y = c_1 e^{m_1 x} + c_2 e^{m_2 x}.$$
To obtain the general solution of (23-7) when $m_1 = m_2 = m$, we proceed
as follows: We set in (23-7)
$$(D - m)y = v \tag{23-8}$$
and obtain a first-order equation for $v$,
$$(D - m)v = 0, \qquad \text{or} \qquad \frac{dv}{dx} - mv = 0. \tag{23-9}$$
Its general solution is $v = c_1 e^{mx}$. The substitution of this in the right-hand
member of (23-8) yields the first-order linear equation for $y$
$$\frac{dy}{dx} - my = c_1 e^{mx},$$
whose general solution is easily found from (10-8). We thus get the solution
$$y = c_1 e^{mx} + c_2 x e^{mx}, \tag{23-10}$$
which agrees with (22-9).
Example 1. Find the general solution of $y'' + 5y' + 6y = 0$. This equation can be
written as
$$(D^2 + 5D + 6)\,y = 0.$$
On factoring the operator we get
$$(D + 2)(D + 3)\,y = 0,$$
and thus the general solution is
$$y = c_1 e^{-2x} + c_2 e^{-3x}.$$
Example 2. Solve $y'' - 4y' + 4y = 0$. We write this equation as
$$(D^2 - 4D + 4)\,y = 0 \qquad \text{or} \qquad (D - 2)(D - 2)\,y = 0.$$
Since the roots of the characteristic equation are equal, the general solution is
$$y = c_1 e^{2x} + c_2 x e^{2x}.$$
PROBLEMS
Solve:
1. - HD ~ M)V - 0;
2. $(D^2 - 1)y = 0$;
3. $(D^2 + D - 2)y = 0$;
4. $(D - 3)^2 y = 0$;
5. $(D^2 + 2D + 1)y = 0$.
24. Nonhomogeneous Second-order Linear Equations. The equation
$$y'' + p_1(x)\,y' + p_2(x)\,y = f(x), \tag{24-1}$$
in which the right-hand member $f(x)$ is a known continuous function, is
called nonhomogeneous. The existence and uniqueness theorem of Sec. 1
guarantees that this equation has one, and only one, solution satisfying
the conditions
$$y(x_0) = y_0, \qquad y'(x_0) = y_0'$$
whenever the coefficients $p_i(x)$ are continuous functions. If $y = u(x)$
is any solution of (24-1) and $y_1(x)$ and $y_2(x)$ are linearly independent
solutions of the associated homogeneous equation
$$y'' + p_1(x)\,y' + p_2(x)\,y = 0, \tag{24-2}$$
the general solution of (24-1) is
$$y = c_1 y_1(x) + c_2 y_2(x) + u(x). \tag{24-3}$$
The fact that (24-3) is, indeed, a solution of (24-1) follows upon substituting
(24-3) in (24-1) and noting that
$$u''(x) + p_1(x)\,u'(x) + p_2(x)\,u(x) = f(x)$$
and that $y = c_1 y_1(x) + c_2 y_2(x)$ satisfies the homogeneous equation (24-2).
The proof that (24-3) is the general solution is virtually identical with the
proof in Sec. 21 for the homogeneous equation.¹
We shall see in Sec. 28 that a particular integral $u(x)$ of (24-1) can always
be determined whenever the general solution² of the associated homogeneous
equation (24-2) is known. In special instances, however, particular
integrals of nonhomogeneous equations can be deduced without the
knowledge of the general solution of the homogeneous equation. This
simpler technique, based on judicious guesses of the probable forms of
particular integrals, is known as the method of undetermined coefficients.
It is applicable to linear equations with constant coefficients only when the
right-hand member $f(x)$ has certain special simple forms.
We illustrate the essence of this method by several examples and develop
it in greater detail in the following section.
¹ The only difference is that the terms $y_0 - u(x_0)$ and $y_0' - u'(x_0)$, instead of $y_0$ and $y_0'$,
now appear in the right-hand members of Eqs. (21-8).
² This general solution is often called the complementary function.
Example 1. The right-hand member of
$$y'' + 3y' + 2y = 2e^x \tag{24-4}$$
suggests that it probably has a solution of the form $y = ae^x$, for the differentiation of
exponentials yields exponentials. Accordingly, we take $y = ae^x$ as our trial solution,
substitute it in (24-4), and obtain
$$ae^x + 3ae^x + 2ae^x = 2e^x.$$
On dividing by $e^x$ we get
$$6a = 2 \qquad \text{or} \qquad a = \tfrac{1}{3}.$$
Thus, $y = \tfrac{1}{3}e^x$ is a solution of (24-4). The characteristic equation of the associated
homogeneous equation
$$y'' + 3y' + 2y = 0 \tag{24-5}$$
is
$$m^2 + 3m + 2 = 0 \qquad \text{or} \qquad (m + 1)(m + 2) = 0.$$
Hence a pair of linearly independent solutions of (24-5) is $y = e^{-x}$, $y = e^{-2x}$, and the
general solution of the given equation (24-4) is $y = c_1 e^{-x} + c_2 e^{-2x} + \tfrac{1}{3}e^x$.
If the solution of (24-4) satisfying the given initial conditions is required, the
constants $c_i$ must be determined from these conditions. For example, if we seek the
solution such that $y(0) = -\tfrac{2}{3}$ and $y'(0) = 0$, we obtain, on setting $x = 0$ in the general
solution,
$$-\tfrac{2}{3} = c_1 + c_2 + \tfrac{1}{3}, \qquad \text{or} \qquad c_1 + c_2 = -1.$$
Also, $y' = -c_1 e^{-x} - 2c_2 e^{-2x} + \tfrac{1}{3}e^x$,
and since $y'(0) = 0$, we have
$$0 = -c_1 - 2c_2 + \tfrac{1}{3}, \qquad \text{or} \qquad c_1 + 2c_2 = \tfrac{1}{3}.$$
We easily verify that $c_1 = -\tfrac{7}{3}$, $c_2 = \tfrac{4}{3}$, and the desired solution therefore is
$$y = -\tfrac{7}{3}e^{-x} + \tfrac{4}{3}e^{-2x} + \tfrac{1}{3}e^x.$$
Example 2. If we attempt to obtain a solution of
$$y'' + 3y' + 2y = 2e^{-x} \tag{24-6}$$
by taking a trial solution $y = ae^{-x}$ we get
$$ae^{-x} - 3ae^{-x} + 2ae^{-x} = 2e^{-x}.$$
This gives a nonsensical result, $0 = 2e^{-x}$. The reason that the trial solution of the form
$y = ae^{-x}$ is not suitable in this case is the following: The homogeneous equation
associated with (24-6), as we saw in the preceding example, has a solution $y = ae^{-x}$, and the
substitution of it in (24-6) naturally makes its left-hand member vanish. In this case
we take the trial solution in the form $y = axe^{-x}$. Then, $y' = ae^{-x} - axe^{-x}$,
$y'' = -ae^{-x} - ae^{-x} + axe^{-x}$, and the substitution in (24-6) now yields
$$-2ae^{-x} + axe^{-x} + 3ae^{-x} - 3axe^{-x} + 2axe^{-x} = 2e^{-x},$$
or $ae^{-x} = 2e^{-x}$.
Thus $a = 2$, and a solution of (24-6) is $y = 2xe^{-x}$. The general solution of (24-6),
therefore, is
$$y = c_1 e^{-x} + c_2 e^{-2x} + 2xe^{-x}.$$
Example 3. Find the general solution of
$$y'' + 2y' + y = e^{-x}. \tag{24-7}$$
We recall (Example 3, Sec. 22) that a pair of linearly independent solutions of the
associated homogeneous equation is $y = e^{-x}$, $y = xe^{-x}$. Accordingly, neither $y = ae^{-x}$ nor
$y = axe^{-x}$ is suitable as a trial solution of (24-7). In this case we take the trial solution
$$y = ax^2 e^{-x}. \tag{24-8}$$
We compute
$$y' = 2axe^{-x} - ax^2 e^{-x},$$
$$y'' = 2ae^{-x} - 4axe^{-x} + ax^2 e^{-x},$$
and on making substitutions in (24-7) find
$$2ae^{-x} - 4axe^{-x} + ax^2 e^{-x} + 4axe^{-x} - 2ax^2 e^{-x} + ax^2 e^{-x} = e^{-x},$$
or $2ae^{-x} = e^{-x}$.
Thus, $a = \tfrac{1}{2}$, and from (24-8) $y = \tfrac{1}{2}x^2 e^{-x}$ is a solution of (24-7). Its general solution
is
$$y = c_1 e^{-x} + c_2 xe^{-x} + \tfrac{1}{2}x^2 e^{-x}.$$
These examples suggest a procedure to be followed in obtaining particular
integrals of equations with constant coefficients of the type
$$y'' + p_1 y' + p_2 y = Ae^{kx}. \tag{24-9}$$
The characteristic equation associated with (24-9) is
$$m^2 + p_1 m + p_2 = 0. \tag{24-10}$$
If this equation has two distinct roots $m = m_1$ and $m = m_2$, then the
linearly independent solutions of the homogeneous equation
$$y'' + p_1 y' + p_2 y = 0 \tag{24-11}$$
are $y = e^{m_1 x}$ and $y = e^{m_2 x}$. When Eq. (24-10) has a double root $m_2 = m_1$,
the linearly independent solutions of (24-11) are $y = e^{m_1 x}$ and $y = xe^{m_1 x}$.
Now, if $k$ in the right-hand member of (24-9) is not equal to either $m_1$ or
$m_2$, Eq. (24-9) has a solution of the form $y = ae^{kx}$. If $k$ is a simple root
of (24-10), then (24-9) has a solution of the form $y = axe^{kx}$. When $k$ is a
double root of (24-10), the particular integral can be taken in the form
$y = ax^2 e^{kx}$.
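The rule just stated can be turned into a closed formula for the coefficient $a$. Substituting $y = ax^s e^{kx}$ into (24-9) gives $a = A/(k^2 + p_1 k + p_2)$ when $k$ is not a root, $a = A/(2k + p_1)$ when $k$ is a simple root, and $a = A/2$ when $k$ is a double root; these three formulas are a derivation added here, not quoted from the text. The sketch below uses the hypothetical name `exp_particular` and a tolerance `tol` for deciding when a quantity is zero.

```python
def exp_particular(p1, p2, A, k, tol=1e-12):
    """Return (a, s) such that y = a * x**s * exp(k*x) solves
    y'' + p1*y' + p2*y = A*exp(k*x)."""
    char = k * k + p1 * k + p2   # characteristic polynomial (24-10) at m = k
    dchar = 2 * k + p1           # its derivative at m = k
    if abs(char) > tol:          # k is not a root: trial a*e^{kx}
        return A / char, 0
    if abs(dchar) > tol:         # k is a simple root: trial a*x*e^{kx}
        return A / dchar, 1
    return A / 2, 2              # k is a double root: trial a*x^2*e^{kx}

ex1 = exp_particular(3, 2, 2, 1)     # Example 1: y'' + 3y' + 2y = 2e^x
ex2 = exp_particular(3, 2, 2, -1)    # Example 2: y'' + 3y' + 2y = 2e^{-x}
ex3 = exp_particular(2, 1, 1, -1)    # Example 3: y'' + 2y' + y = e^{-x}
```

The three calls reproduce the coefficients $\tfrac{1}{3}$, $2$, and $\tfrac{1}{2}$ found in Examples 1 to 3, with the powers of $x$ equal to 0, 1, and 2 respectively.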
Similar considerations apply to equations of the form
$$y'' + p_1 y' + p_2 y = A_0 + A_1 x + \cdots + A_n x^n. \tag{24-12}$$
If¹ $p_2 \neq 0$, we can take the trial solution
$$y = a_0 + a_1 x + \cdots + a_n x^n. \tag{24-13}$$
¹ This means that $m = 0$ is not a root of the characteristic equation (24-10) and hence
(24-11) has no solution $y = \text{const}$.
The substitution of (24-13) in (24-12) then yields on comparison of like
powers of $x$ on both sides of the resulting equation the values of the
unknown constants $a_i$. If $p_2 = 0$, the characteristic equation (24-10) has
$m = 0$ as one of its roots. In this event the trial solution can be taken in
the form
$$y = x(a_0 + a_1 x + \cdots + a_n x^n). \tag{24-14}$$
We illustrate the use of these rules by two examples.
Example 4. Find a solution of
$$y'' + 3y' + 2y = 1 + 2x. \tag{24-15}$$
Since $p_2 = 2 \neq 0$, we take the trial solution
$$y = a_0 + a_1 x, \tag{24-16}$$
substitute it in (24-15), and find
$$3a_1 + 2(a_0 + a_1 x) = 1 + 2x.$$
On comparing like powers of $x$, we get
$$3a_1 + 2a_0 = 1, \qquad 2a_1 = 2,$$
whence
$$a_1 = 1, \qquad a_0 = -1.$$
The substitution of these values in (24-16) gives the desired solution $y = -1 + x$.
Example 5. Find the solution of
$$y'' + 3y' = 1 - 9x^2 \tag{24-17}$$
satisfying the conditions $y(0) = 0$, $y'(0) = 1$.
Since $p_2 = 0$ in (24-17), we seek a solution in the form
$$y = x(a_0 + a_1 x + a_2 x^2). \tag{24-18}$$
We compute
$$y' = a_0 + 2a_1 x + 3a_2 x^2,$$
$$y'' = 2a_1 + 6a_2 x,$$
and insert in (24-17). The result is
$$2a_1 + 6a_2 x + 3(a_0 + 2a_1 x + 3a_2 x^2) = 1 - 9x^2$$
or
$$2a_1 + 3a_0 + (6a_2 + 6a_1)x + 9a_2 x^2 = 1 - 9x^2.$$
Hence
$$2a_1 + 3a_0 = 1, \qquad 6a_2 + 6a_1 = 0, \qquad 9a_2 = -9.$$
Solving these equations, we get
$$a_2 = -1, \qquad a_1 = 1, \qquad a_0 = -\tfrac{1}{3},$$
and the substitution of these values in (24-18) gives
$$y = x(-\tfrac{1}{3} + x - x^2).$$
The characteristic equation for (24-17) is
$$m^2 + 3m = 0.$$
Since its roots are $m = 0$, $m = -3$, the general solution of (24-17) is
$$y = c_1 + c_2 e^{-3x} + x(-\tfrac{1}{3} + x - x^2). \tag{24-19}$$
To determine the constants so that the solution (24-19) satisfies the given conditions,
we compute
$$y'(x) = -3c_2 e^{-3x} - (\tfrac{1}{3} - 2x + 3x^2).$$
The conditions $y(0) = 0$ and $y'(0) = 1$ then demand that
$$c_1 + c_2 = 0, \qquad -3c_2 - \tfrac{1}{3} = 1.$$
Thus, $c_2 = -\tfrac{4}{9}$, $c_1 = \tfrac{4}{9}$, and hence the desired solution is
$$y = \tfrac{4}{9} - \tfrac{4}{9}e^{-3x} + x(-\tfrac{1}{3} + x - x^2).$$
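The comparison of like powers used in Examples 4 and 5 can be carried out mechanically from the highest power downward. The sketch below is an illustration under stated assumptions: the name `poly_particular` is hypothetical, the right-hand side is passed as the list `[A_0, A_1, ..., A_n]`, exact rational arithmetic is used via `fractions.Fraction`, and the case $p_1 = p_2 = 0$ (where the equation is simply integrated twice) is rejected rather than handled.

```python
from fractions import Fraction

def poly_particular(p1, p2, A):
    """Coefficients b[0], b[1], ... (constant term first) of a polynomial
    solution of y'' + p1*y' + p2*y = sum(A[j] * x**j)."""
    p1, p2 = Fraction(p1), Fraction(p2)
    n = len(A) - 1
    s = 0 if p2 != 0 else 1              # trial (24-13) or trial (24-14)
    if s == 1 and p1 == 0:
        raise ValueError("p1 = p2 = 0: integrate the right-hand side twice")
    b = [Fraction(0)] * (n + s + 3)      # zero padding above degree n + s
    # Coefficient of x**j: p2*b[j] + p1*(j+1)*b[j+1] + (j+2)*(j+1)*b[j+2] = A[j]
    for j in range(n, -1, -1):
        rhs = Fraction(A[j]) - (j + 2) * (j + 1) * b[j + 2]
        if s == 0:
            b[j] = (rhs - p1 * (j + 1) * b[j + 1]) / p2
        else:
            b[j + 1] = rhs / (p1 * (j + 1))
    return b[: n + s + 1]

ex4 = poly_particular(3, 2, [1, 2])        # Example 4: y'' + 3y' + 2y = 1 + 2x
ex5 = poly_particular(3, 0, [1, 0, -9])    # Example 5: y'' + 3y' = 1 - 9x^2
```

For Example 4 the result corresponds to $y = -1 + x$, and for Example 5 to $y = -\tfrac{1}{3}x + x^2 - x^3$, in agreement with the hand computations.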
We state in conclusion that the trial solution for the more general
equation
$$y'' + p_1 y' + p_2 y = e^{kx}(A_0 + A_1 x + \cdots + A_n x^n) \tag{24-20}$$
can be sought in the form
$$y = e^{kx}(a_0 + a_1 x + \cdots + a_n x^n) \tag{24-21}$$
if $k$ is not a root of (24-10). If $k$ is a simple root of (24-10), the trial solution
(24-21) must be multiplied by $x$ and, if the root is double, by $x^2$.
PROBLEMS
Obtain the general solution:
1.
y" -
W + 63/ =
e 4r ;
2. y"
4- 2 y' 4 y « x;
3 .
v" +
57/' 4- 6j/ *
«*;
4. y"
— 2i/' 4- y «* x;
6.
(O 2 -
- 1 )y « 5x -
- 2 ;
6 . (O 2
- 1)?/ - e 2 *(x -
- i);
7 .
(O -
1) 2 ?/ -
8 . (O 2
~ 6/> 4- 9)3/ «
c 3 *;
9.
0(0
+ «)lf - 3;
10. y”
+ 92 / - x 2 - 2 x
4- 1;
11 .
f"-
y **
12. y"
4 - ?/ *= x 3 -b x;
13 .
ll" -
W 4* 6 y -
ae** 1 *;
14. (O
-- 1) 2 t/ * c*(x -
i);
15 .
(O 2 -
- 5D + 6 ) 3 /
= 3x s + 4i - 2;
16 . (D 2
- 5D)y * 3x 2 •
f 4*
Obtain the solution for each of the following equations satisfying the given conditions:
17 . y" + by' + iy - 20c 1 , ?/(0) = 0, y'(0) 2;
18 . y" + y' = 1 + 2x, 3/(0) - 0, t/'(0) = 0;
18 . y" + y’ -0, y(0) - 0, j/'(0) - 0;
20. y" + 4j/' + 3j/ - x, y(0) = — 46, J/'(0) - %\
21. y" + 4 y' + 3y - 0, y(0) - 0, j/'(0) - 0;
22. j/" + 4y' + 3y - y(0) - 1, y'(0) - 0.
25. The Use of Complex Forms of Solutions in Evaluating Particular
Integrals. The method of determining particular integrals of Eq. (24-9),
described in the preceding section, can be extended to equations of the
form
$$y'' + p_1 y' + p_2 y = Ae^{kx} \cos nx, \tag{25-1}$$
$$y'' + p_1 y' + p_2 y = Ae^{kx} \sin nx, \tag{25-2}$$
in which $k$ may be equal to zero. If we recall the formula
$$e^{inx} = \cos nx + i \sin nx,$$
it becomes clear that $Ae^{kx} \cos nx$ and $Ae^{kx} \sin nx$ are, respectively, the real
and imaginary parts of the function $Ae^{(k+in)x}$. Now, if instead of Eqs.
(25-1) and (25-2) we consider the equation
$$y'' + p_1 y' + p_2 y = Ae^{(k+in)x} \tag{25-3}$$
and obtain its solution $y = u + iv$, the real part $u$ of such a solution will
satisfy Eq. (25-1) and the imaginary part $v$ will be a solution of (25-2).
We illustrate this method of deducing solutions of equations in the forms
(25-1) and (25-2) by examples.
Example 1. Find a solution of
$$y'' + y = 3 \sin 2x. \tag{25-4}$$
Since $e^{2ix} = \cos 2x + i \sin 2x$, we consider, instead of (25-4), the equation
$$y'' + y = 3(\cos 2x + i \sin 2x) = 3e^{2ix}. \tag{25-5}$$
The imaginary part of a solution of (25-5) is clearly a solution of (25-4). Equation (25-5)
has the form (24-9) with $k = 2i$, and since neither of the roots of the characteristic
equation $m^2 + 1 = 0$ is equal to $2i$, we take the trial solution
$$y = ae^{2ix}.$$
Now
$$y' = 2iae^{2ix}, \qquad y'' = (2i)^2 ae^{2ix} = -4ae^{2ix},$$
and the substitution in (25-5) yields
$$-4ae^{2ix} + ae^{2ix} = 3e^{2ix}.$$
Thus, $a = -1$, and consequently $y = -e^{2ix}$ is an integral of (25-5). The imaginary
part of $-e^{2ix}$ is $-\sin 2x$, and hence a solution of (25-4) is $y = -\sin 2x$.
Example 2. Find one integral of
$$y'' + y = 3 \cos x. \tag{25-6}$$
Since $e^{ix} = \cos x + i \sin x$, we consider
$$y'' + y = 3(\cos x + i \sin x) = 3e^{ix}, \tag{25-7}$$
the real part of the solution of which satisfies (25-6). This time $k$ in (24-9) is $+i$, and
since the roots of the characteristic equation are $\pm i$, we take the trial solution
$$y = axe^{ix}. \tag{25-8}$$
From (25-8),
$$y' = ae^{ix} + aixe^{ix},$$
$$y'' = 2aie^{ix} - axe^{ix},$$
and the substitution in (25-7) gives
$$2aie^{ix} - axe^{ix} + axe^{ix} = 3e^{ix}.$$
Thus, $a = 3/(2i) = -\tfrac{3}{2}i$, and therefore
$$y = -\tfrac{3}{2}ixe^{ix} = -\tfrac{3}{2}ix(\cos x + i \sin x)$$
is a solution of (25-7). The real part of this solution is $\tfrac{3}{2}x \sin x$, and we conclude that
$y = \tfrac{3}{2}x \sin x$ is a solution of (25-6).
Example 3. Find a solution of
$$y'' + 2y' + 2y = e^{-x} \cos x. \tag{25-9}$$
Since $e^{-x} \cos x$ is the real part of
$$e^{-x}(\cos x + i \sin x) = e^{-x}e^{ix} = e^{x(-1+i)},$$
we consider the equation
$$y'' + 2y' + 2y = e^{x(-1+i)}. \tag{25-10}$$
The roots of the characteristic equation
$$m^2 + 2m + 2 = 0$$
are $m = -1 \pm i$, and since one of these roots appears in the exponent in (25-10), we
take the trial solution
$$y = axe^{x(-1+i)}.$$
Then,
$$y' = ae^{x(-1+i)} + ax(-1 + i)e^{x(-1+i)},$$
$$y'' = 2a(-1 + i)e^{x(-1+i)} + ax(-1 + i)^2 e^{x(-1+i)},$$
and on making substitutions in (25-10), we find
$$2aie^{x(-1+i)} = e^{x(-1+i)},$$
so that
$$a = \frac{1}{2i}.$$
Thus an integral of (25-10) is
$$y = \frac{1}{2i}xe^{x(-1+i)} = \frac{1}{2i}xe^{-x}(\cos x + i \sin x).$$
The real part of this, $\tfrac{1}{2}xe^{-x} \sin x$, is a solution of (25-9).
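The complex method of Examples 1 to 3 can be sketched with Python's built-in complex arithmetic. The routine below solves the complex equation (25-3) and returns the solution as a complex-valued function; its real and imaginary parts then serve (25-1) and (25-2). The name `complex_particular`, the tolerance `tol`, and the closed formulas for the coefficient (the value of $A$ divided by the characteristic polynomial, by its derivative, or by 2) are illustrative assumptions added here, not taken from the text.

```python
import cmath
import math

def complex_particular(p1, p2, A, k, n, tol=1e-12):
    """Return a complex-valued function y(x) solving
    y'' + p1*y' + p2*y = A*exp((k + i*n)*x)."""
    r = complex(k, n)
    char = r * r + p1 * r + p2       # characteristic polynomial at m = k + in
    dchar = 2 * r + p1               # its derivative at m = k + in
    if abs(char) > tol:
        a, s = A / char, 0           # exponent is not a root
    elif abs(dchar) > tol:
        a, s = A / dchar, 1          # exponent is a simple root
    else:
        a, s = A / 2, 2              # exponent is a double root
    return lambda x: a * x**s * cmath.exp(r * x)

y1 = complex_particular(0, 1, 3, 0, 2)    # Example 1: take the imaginary part
y2 = complex_particular(0, 1, 3, 0, 1)    # Example 2: take the real part
y3 = complex_particular(2, 2, 1, -1, 1)   # Example 3: take the real part
x = 0.7
```

At any sample point such as `x`, `y1(x).imag` should agree with $-\sin 2x$, `y2(x).real` with $\tfrac{3}{2}x \sin x$, and `y3(x).real` with $\tfrac{1}{2}xe^{-x} \sin x$.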
The methods of this and the preceding section can be extended to
equations
$$y'' + p_1 y' + p_2 y = f(x) \tag{25-11}$$
in which the right-hand member is a sum of several functions of the types
considered in these sections, for suppose that $f(x) = f_1(x) + f_2(x)$, so that
(25-11) reads
$$y'' + p_1 y' + p_2 y = f_1(x) + f_2(x). \tag{25-12}$$
If we consider a pair of equations
$$\begin{aligned} y'' + p_1 y' + p_2 y &= f_1(x), \\ y'' + p_1 y' + p_2 y &= f_2(x), \end{aligned} \tag{25-13}$$
and denote the solution of the first of these by $y = u_1(x)$ and that of the
second by $y = u_2(x)$, then $y = u_1(x) + u_2(x)$ will be a solution of (25-12).
The proof follows at once on inserting $y = u_1(x) + u_2(x)$ in (25-12). As
an illustration of the use of this theorem we consider a simple example.
Example 4. Find one solution of
$$y'' + y = 3 \cos x + 1 + 2e^x.$$
We consider three equations:
$$y'' + y = 3 \cos x,$$
$$y'' + y = 1,$$
$$y'' + y = 2e^x.$$
A particular integral of the first of these, as shown in Example 2, is $y = \tfrac{3}{2}x \sin x$, and
solutions of the second and third equations are, respectively, $y = 1$ and $y = e^x$, as is
clear by inspection. Hence an integral of the given equation is $y = \tfrac{3}{2}x \sin x + 1 + e^x$.
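A superposed solution of this kind can be spot-checked numerically without any symbolic work. The sketch below estimates the derivatives by central differences and evaluates the residual of the equation of Example 4 at a few arbitrary points; the name `residual`, the step size `h`, and the sample points are illustrative assumptions, and the check is only approximate because of truncation and rounding error.

```python
import math

def residual(y, f, p1, p2, x, h=1e-4):
    """Central-difference estimate of y'' + p1*y' + p2*y - f at x."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + p1 * d1 + p2 * y(x) - f(x)

# Candidate from Example 4 and the right-hand member of its equation.
y = lambda x: 1.5 * x * math.sin(x) + 1 + math.exp(x)
f = lambda x: 3 * math.cos(x) + 1 + 2 * math.exp(x)

worst = max(abs(residual(y, f, 0.0, 1.0, x)) for x in (-1.0, 0.0, 0.5, 2.0))
```

A value of `worst` close to zero (limited by the finite-difference error) is consistent with the candidate being a solution; a value of order one would indicate an algebra slip.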
PROBLEMS
Solve:
1. (Z> 2 - 32> + 2)y - cos 2x;
2. (Z) 2 *4* 4)y * cos 3x;
3. ( D 2 - 3 - H)!/ * — cos x — 3 sin x;
4. y" + 5y' -f 6y * 3e“ 2x + *' 3x ;
5. y " + 2y' 5y » e 1 sin 2x;
8. y" — y' — 6y = e 3x cos 3x;
7. (Z> 2 - 25)y - e 5x + x 2 - 4x;
8 . (D 2 + l)y 3 sin 2x - 9 cos 3x.
Obtain the solution satisfying the conditions y( 0) » 0, y'( 0) » 0 for each of the fol-
lowing:
9. y" ~ y - sin x; 10 . y" -f 2y' -f 5y = 0;
11 . y" — 2y' ® e“ x cos x; 12 . y" -f- y ® cos x + 1.
26. Linear nth-order Equations with Constant Coefficients. The results
of Secs. 22 to 25 are easily extended to nth-order linear equations
$$y^{(n)} + p_1 y^{(n-1)} + \cdots + p_n y = f(x) \tag{26-1}$$
with constant coefficients. In dealing with such equations it is convenient
to make a systematic use of the operator notation introduced in Sec. 23
and write (26-1) in the form
$$(D^n + p_1 D^{n-1} + \cdots + p_{n-1}D + p_n)\,y = f(x). \tag{26-2}$$
The homogeneous equation associated with (26-2) is
$$(D^n + p_1 D^{n-1} + \cdots + p_{n-1}D + p_n)\,y = 0, \tag{26-3}$$
and if one substitutes in it $y = e^{mx}$, there results
$$(m^n + p_1 m^{n-1} + \cdots + p_{n-1}m + p_n)\,e^{mx} = 0.$$
It follows that $y = e^{mx}$ is a solution of (26-3) whenever $m$ is a root of the
characteristic equation
$$m^n + p_1 m^{n-1} + \cdots + p_{n-1}m + p_n = 0. \tag{26-4}$$
If this equation has $n$ distinct roots
$$m = m_1, \quad m = m_2, \quad \ldots, \quad m = m_n,$$
then
$$y = e^{m_1 x}, \quad y = e^{m_2 x}, \quad \ldots, \quad y = e^{m_n x}$$
are distinct solutions, and we can conclude (see Sec. 27) that
$$y = c_1 e^{m_1 x} + c_2 e^{m_2 x} + \cdots + c_n e^{m_n x} \tag{26-5}$$
is a general solution in the sense that the arbitrary constants $c_i$ in (26-5)
can be determined to satisfy the prescribed initial conditions
$$y(x_0) = y_0, \quad y'(x_0) = y_0', \quad \ldots, \quad y^{(n-1)}(x_0) = y_0^{(n-1)}. \tag{26-6}$$
Since the coefficients in (26-4) are real, the complex roots of (26-4) must
necessarily occur in conjugate pairs. Thus, if $m_1 = a + bi$ and $m_2 = a - bi$
are a pair of such roots, the solutions corresponding to them are
$$y_1 = e^{(a+bi)x} = e^{ax}(\cos bx + i \sin bx),$$
$$y_2 = e^{(a-bi)x} = e^{ax}(\cos bx - i \sin bx).$$
As in Sec. 22, we prove that the real and imaginary parts of these solutions
yield a pair of linearly independent real solutions
$$y = e^{ax} \cos bx, \qquad y = e^{ax} \sin bx.$$
When the roots of characteristic equation (26-4) are not simple, and if,
for example, the root $m_1$ has the multiplicity $k$, then corresponding to it
there will be a set of $k$ distinct solutions,¹
$$y_1 = e^{m_1 x}, \quad y_2 = xe^{m_1 x}, \quad \ldots, \quad y_k = x^{k-1}e^{m_1 x}. \tag{26-7}$$
The proof of this assertion follows upon making obvious modifications in
the argument presented in Sec. 23.
We illustrate these statements by two examples.
Example 1. Find the general solution of the fourth-order equation
    $y^{iv} - 2y''' + 2y'' - 2y' + y = 0,$ (26-8)
or  $(D^4 - 2D^3 + 2D^2 - 2D + 1)y = 0.$ (26-9)
The characteristic equation for (26-8) has the structure determined by the operator in
(26-9). It is
    $m^4 - 2m^3 + 2m^2 - 2m + 1 = 0.$
On factoring this we get
    $(m^2 + 1)(m - 1)^2 = 0.$
Thus, there are two simple roots $m_1 = i$, $m_2 = -i$ and the double root $m_3 = m_4 = 1$.
Solutions corresponding to these roots are
    $y_1 = e^{ix}, \quad y_2 = e^{-ix}, \quad y_3 = e^x, \quad y_4 = xe^x,$
and the general solution is
    $y = c_1 e^{ix} + c_2 e^{-ix} + c_3 e^x + c_4 xe^x.$
This can be written in real form as
    $y = C_1 \cos x + C_2 \sin x + (C_3 + C_4 x)e^x.$
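The root structure claimed in Example 1 can be checked mechanically. The following is a minimal sketch, not part of the original text: the helper names `horner`, `derivative`, and `multiplicity` are illustrative, and the multiplicity of a root is found by counting how many successive derivatives of the characteristic polynomial vanish there.

```python
# Characteristic polynomial of (26-8): m^4 - 2m^3 + 2m^2 - 2m + 1.
COEFFS = [1, -2, 2, -2, 1]  # highest power first

def horner(coeffs, m):
    """Evaluate a polynomial (highest power first) at m by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * m + c
    return value

def derivative(coeffs):
    """Coefficients of the derivative polynomial (highest power first)."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]

def multiplicity(coeffs, root, tol=1e-12):
    """Count how many successive derivatives vanish at the given point."""
    count = 0
    while len(coeffs) > 1 and abs(horner(coeffs, root)) < tol:
        count += 1
        coeffs = derivative(coeffs)
    return count

for root in (1j, -1j, 1):
    print(root, multiplicity(COEFFS, root))
```

A root of multiplicity k contributes the k solutions of (26-7), so the counts printed here correspond to the solutions $e^{ix}$, $e^{-ix}$ and the pair $e^x$, $xe^x$.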
Example 2. The equation
    $(D^4 + 3D^3 + 3D^2 + D)y = 0$
or  $D(D + 1)^3 y = 0$
has the characteristic equation
    $m(m + 1)^3 = 0.$
Accordingly, the general solution is
    $y = c_1 e^{0x} + c_2 e^{-x} + c_3 xe^{-x} + c_4 x^2 e^{-x}.$
An argument in every respect similar to that given in Sec. 24 yields the
result that when $y = u(x)$ is any solution of Eq. (26-1) and
$y = c_1 y_1(x) + c_2 y_2(x) + \cdots + c_n y_n(x)$ is the general solution of the homogeneous
equation (26-3), then the general solution of (26-1) is
    $y = c_1 y_1(x) + c_2 y_2(x) + \cdots + c_n y_n(x) + u(x).$ (26-10)
¹ If the complex root $m_1 = a + bi$ is of multiplicity k, then corresponding to this root
and to its conjugate $m_2 = a - bi$, there will be a set of 2k real solutions:
    $e^{ax} \cos bx, \quad xe^{ax} \cos bx, \quad \ldots, \quad x^{k-1} e^{ax} \cos bx,$
    $e^{ax} \sin bx, \quad xe^{ax} \sin bx, \quad \ldots, \quad x^{k-1} e^{ax} \sin bx.$
The calculation of particular solutions u(x) by the method of undetermined
coefficients for functions f(x) of the type considered in Secs. 24 and 25
follows, with obvious minor modifications, the pattern of those sections.
Without further ado we illustrate the procedure by examples. 1
Example 3. Find a solution of
    $y''' + y'' + 2y' = x^2 + 3x + 1.$ (26-11)
The left-hand member of this equation contains no y (that is, $p_3 = 0$). On recalling the
statement made for Eq. (24-12), we take the trial solution
    $y = x(a_0 + a_1 x + a_2 x^2).$
On computing the first three derivatives, we obtain
    $y' = a_0 + 2a_1 x + 3a_2 x^2,$
    $y'' = 2a_1 + 6a_2 x,$
    $y''' = 6a_2.$
Substitution in (26-11) then yields
    $(2a_0 + 2a_1 + 6a_2) + (6a_2 + 4a_1)x + 6a_2 x^2 = x^2 + 3x + 1.$
Hence
    $6a_2 = 1,$
    $6a_2 + 4a_1 = 3,$
    $2a_0 + 2a_1 + 6a_2 = 1,$
and we conclude that
    $a_2 = \tfrac{1}{6}, \quad a_1 = \tfrac{1}{2}, \quad a_0 = -\tfrac{1}{2}.$
Accordingly, $y = x(-\tfrac{1}{2} + \tfrac{1}{2}x + \tfrac{1}{6}x^2)$ is a solution of (26-11).
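The bookkeeping of Example 3 is easy to repeat with exact rational arithmetic. The sketch below is an illustration only; the helpers `poly_derivative` and `poly_add` are hypothetical names, and polynomials are stored as coefficient lists with the lowest power first.

```python
from fractions import Fraction

# Matching powers of x in Example 3 gives a triangular system:
#   6*a2 = 1,   6*a2 + 4*a1 = 3,   2*a0 + 2*a1 + 6*a2 = 1.
a2 = Fraction(1, 6)
a1 = (3 - 6 * a2) / 4
a0 = (1 - 2 * a1 - 6 * a2) / 2

def poly_derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def poly_add(*polys):
    """Add polynomials of possibly different lengths."""
    length = max(len(p) for p in polys)
    return [sum(p[k] for p in polys if k < len(p)) for k in range(length)]

# Trial solution y = x(a0 + a1 x + a2 x^2) = a0 x + a1 x^2 + a2 x^3.
y = [Fraction(0), a0, a1, a2]
y1 = poly_derivative(y)
y2 = poly_derivative(y1)
y3 = poly_derivative(y2)

# Left-hand member y''' + y'' + 2y' of (26-11); it should equal 1 + 3x + x^2.
lhs = poly_add(y3, y2, [2 * c for c in y1])
print(a0, a1, a2)
print(lhs)
```

Because the system is triangular, the coefficients are obtained by back substitution starting from the highest power, exactly as in the text.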
Example 4. Obtain the general solution of
    $(D^3 - 3D^2 + 2D)y = 4 + 60e^{5x}.$ (26-12)
The characteristic equation for (26-12) is
    $m^3 - 3m^2 + 2m = m(m - 1)(m - 2) = 0.$
Thus, the general solution is
    $y = c_1 + c_2 e^x + c_3 e^{2x} + u(x),$
where u(x) is some integral of (26-12). To obtain u(x) it is simpler to add the particular
integrals of
    $(D^3 - 3D^2 + 2D)y = 4,$ (26-13)
    $(D^3 - 3D^2 + 2D)y = 60e^{5x}.$ (26-14)
For the first of these we take a trial solution $y = ax$. We find on inserting it in (26-13)
that $a = 2$, so that $y = 2x$ is an integral of (26-13). The substitution of $y = ae^{5x}$ in
(26-14) yields, after simple calculation, $a = 1$; hence $y = e^{5x}$ is a solution of (26-14).
Accordingly, an integral of (26-12) is $y = 2x + e^{5x}$, and the desired general solution is
    $y = c_1 + c_2 e^x + c_3 e^{2x} + 2x + e^{5x}.$
1 A general method is presented in Sec. 28.
Example 5. Obtain one solution of $(D^3 + 2D + 7)y = -24e^x \cos 2x$. The right-hand
side is the real part of $-24e^x e^{2ix}$. To solve
    $(D^3 + 2D + 7)y = -24e^x e^{2ix} = -24e^{(1+2i)x}$ (26-15)
try $y = ae^{(1+2i)x}$. Substitution gives
    $(D^3 + 2D + 7)ae^{(1+2i)x} = [(1 + 2i)^3 + 2(1 + 2i) + 7]ae^{(1+2i)x}$
        $= (-2 + 2i)ae^{(1+2i)x} = -24e^{(1+2i)x},$
where the last equality is stated because we want to obtain a solution of (26-15). It
follows that
    $a = \dfrac{-24}{-2 + 2i} = \dfrac{12}{1 - i} \cdot \dfrac{1 + i}{1 + i} = 6 + 6i,$
and hence a solution of (26-15) is
    $y = 6(1 + i)e^{(1+2i)x} = 6e^x(1 + i)(\cos 2x + i \sin 2x).$ (26-16)
Since $-24e^x \cos 2x$ is the real part of $-24e^{(1+2i)x}$, a solution of the original problem is
found by taking the real part of y in (26-16). Thus,
    $y = 6e^x \cos 2x - 6e^x \sin 2x.$
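The complex arithmetic of Example 5 can be reproduced directly with Python's built-in complex numbers. This is a small sketch under the assumptions of the example; the names `s`, `p_of_s`, `y_real`, and `y_closed` are illustrative.

```python
import cmath
import math

s = 1 + 2j                       # exponent in e^{(1+2i)x}
p_of_s = s**3 + 2 * s + 7        # D^3 + 2D + 7 acting on e^{sx} multiplies it by this
a = -24 / p_of_s                 # amplitude of the complex trial solution

def y_real(x):
    """Real part of a*e^{sx}, the solution of the original real problem."""
    return (a * cmath.exp(s * x)).real

def y_closed(x):
    """Closed form quoted in the text: 6e^x cos 2x - 6e^x sin 2x."""
    return 6 * math.exp(x) * (math.cos(2 * x) - math.sin(2 * x))

print(p_of_s, a)
print(y_real(0.3), y_closed(0.3))
```

Working with $e^{(1+2i)x}$ replaces the two-term trial solution in sines and cosines by a single unknown constant a, which is the point of the method.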
PROBLEMS
Find the general solutions:
1. (D - 5)(2 1) + 3 )l)y - 0;
3. (/> 3 4- 3D 2 -f 3 D + 1 ) y - 0;
6. (Z) 3 - 2D 2 4- D)y ~ 0;
7. (i> 4 - k*)y - 0;
9. (D 3 — I> 2 4- 4Z))j/ * 4x 4- e*;
11. (O - 1)(Z> - 2) 2 ?/ - x 2 ;
13 . Find the solution of y"' 4~ 2y 1
y’( 0) * 0, iy”(Q) ~ -25/2.
2. (D 2 4- l)(/> 2 4- 21) + 5)v - 0;
4 . (D* 4- 8)?/ - 0;
6. (D 4 4- 3 D z 4- 3D 2 + D)y - 0;
8. (/) 3 - 3D 2 4- 4)?/ - 0;
10. (/> 4 4- Dv — 2 cos x;
12 . (D 4- 1)(D ~ l)(/> ~ 2)y - 1 - c J
y' — 2y « 2c " Sx 4- 4x 2 which satisfies #(0)
27. General Linear Differential Equations of nth Order. It is not difficult
to extend the considerations of Sec. 21 to a homogeneous nth-order
linear equation
    $y^{(n)} + p_1(x)y^{(n-1)} + p_2(x)y^{(n-2)} + \cdots + p_{n-1}(x)y' + p_n(x)y = 0$ (27-1)
with variable coefficients $p_i(x)$. Word-for-word repetition of the argument
used to establish properties 1 and 2 of Sec. 21 leads to the conclusion that
    $y = c_1 y_1(x) + c_2 y_2(x) + \cdots + c_k y_k(x)$ (27-2)
is a solution of (27-1), for an arbitrary choice of the constants $c_i$, whenever
$y_1(x), y_2(x), \ldots, y_k(x)$ is a set of solutions of (27-1).
A set of k such solutions is said to be linearly independent if the relation
    $c_1 y_1(x) + c_2 y_2(x) + \cdots + c_k y_k(x) \equiv 0$ (27-3)
holds only when $c_1 = c_2 = \cdots = c_k = 0$. When a set of constants $c_i$,
not all of which are zero, can be found such that Eq. (27-3) is true, the
solutions $y_i(x)$ are linearly dependent.
It follows from the existence and uniqueness theorem of Sec. 1 that
Eq. (27-1) has exactly n linearly independent solutions $y_1(x), \ldots, y_n(x)$,
so that
    $y = c_1 y_1(x) + c_2 y_2(x) + \cdots + c_n y_n(x)$ (27-4)
is a general solution of (27-1). This solution is general in the sense that
the constants $c_i$ in (27-4) can always be found, so that there is a unique
solution of (27-1) for the arbitrarily specified initial values
    $y(x_0) = y_0, \quad y'(x_0) = y_0', \quad \ldots, \quad y^{(n-1)}(x_0) = y_0^{(n-1)}.$ (27-5)
An argument analogous to that used to establish the condition (21-9) for
linear independence of two solutions leads to the result that the set of n
solutions $\{y_i(x)\}$, $i = 1, \ldots, n$, is linearly independent if, and only if,
the Wronskian determinant
    $W(y_1, \ldots, y_n) \equiv \begin{vmatrix} y_1 & y_2 & \cdots & y_n \\ y_1' & y_2' & \cdots & y_n' \\ \vdots & \vdots & & \vdots \\ y_1^{(n-1)} & y_2^{(n-1)} & \cdots & y_n^{(n-1)} \end{vmatrix}$ (27-6)
does not vanish for any x in the interval where solutions are sought.
In contradistinction to the case of linear equations with constant
coefficients, no formulas are available for solving general linear equations
with variable coefficients of order 2 or higher. Certain special types of
such equations, however, have been studied extensively, and as shown
in Chap. 2, Sec. 12, their solutions may be obtained as power series.
Just as in Sec. 24, we can show that if $y = u(x)$ is any solution of the
nonhomogeneous equation
    $y^{(n)} + p_1(x)y^{(n-1)} + \cdots + p_{n-1}(x)y' + p_n(x)y = f(x),$ (27-7)
then
    $y = c_1 y_1(x) + c_2 y_2(x) + \cdots + c_n y_n(x) + u(x)$ (27-8)
is the general solution of (27-7) whenever the $y_i$ are linearly independent
solutions of the homogeneous equation (27-1). The determination of
particular integrals of (27-7), as we shall see in the next section, is a
straightforward process provided that the general solution of the associated
homogeneous equation is known.
Example 1. Show that the set of functions $y_1 = x$, $y_2 = x^2$, $y_3 = x^3$ is linearly
independent if $x \ne 0$. The Wronskian (27-6) for this set of functions is
    $W(y_1, y_2, y_3) = \begin{vmatrix} x & x^2 & x^3 \\ 1 & 2x & 3x^2 \\ 0 & 2 & 6x \end{vmatrix} = 2x^3.$
Since $W(y_1, y_2, y_3)$ does not vanish as long as $x \ne 0$, this set is linearly independent in
any interval that does not include $x = 0$.
Example 2. Test for linear independence $y_1 = x^2 + 2x$, $y_2 = x^3 + x$, $y_3 = 2x^3 - x^2$.
We compute the Wronskian for this set of functions:
    $W(y_1, y_2, y_3) = \begin{vmatrix} x^2 + 2x & x^3 + x & 2x^3 - x^2 \\ 2x + 2 & 3x^2 + 1 & 6x^2 - 2x \\ 2 & 6x & 12x - 2 \end{vmatrix} = 0.$
Since $W(y_1, y_2, y_3) \equiv 0$, the given set is linearly dependent. This implies that a set of
constants $c_1$, $c_2$, $c_3$, not all zero, can be found such that
    $c_1 y_1 + c_2 y_2 + c_3 y_3 \equiv 0.$
This, in turn, means that at least one of these functions can be expressed linearly in
terms of the remaining ones. In fact, it is easy to check that $y_3 = 2y_2 - y_1$.
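Both Wronskians above can be evaluated numerically at a few sample points as a sanity check. The sketch below is only an illustration; `det3`, `wronskian_ex1`, and `wronskian_ex2` are hypothetical helper names, and the derivative rows are entered by hand from the two examples.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def wronskian_ex1(x):
    # Rows: the functions x, x^2, x^3, then their first and second derivatives.
    return det3([[x, x**2, x**3],
                 [1, 2 * x, 3 * x**2],
                 [0, 2, 6 * x]])

def wronskian_ex2(x):
    # y1 = x^2 + 2x, y2 = x^3 + x, y3 = 2x^3 - x^2 and their derivatives.
    return det3([[x**2 + 2 * x, x**3 + x, 2 * x**3 - x**2],
                 [2 * x + 2, 3 * x**2 + 1, 6 * x**2 - 2 * x],
                 [2, 6 * x, 12 * x - 2]])

for x in (-2, 1, 3):
    print(x, wronskian_ex1(x), 2 * x**3, wronskian_ex2(x))
```

With integer sample points the arithmetic is exact, so the first determinant reproduces $2x^3$ and the second vanishes identically because its third column equals twice the second minus the first.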
PROBLEMS
Test for linear dependence the following sets of functions:
1. $e^{-x}$, 1, $e^x$, $\sinh x$;    2. 1, $\sin x$, $\cos x$;
3. $x^2 - 2x + 5$, $3x - 1$, $\sin x$;    4. $(x + 1)^2$, $(x - 1)^2$, $3x$;
5. $e^{ax}$, $e^{bx}$, $e^{cx}$, $a \ne b$, $b \ne c$, $c \ne a$;    6. $e^{ix}$, $\sin x$, $\cos x$;
7. $e^x$, $xe^x$, $x^2 e^x$.
28. Variation of Parameters. We proceed to show that a particular
integral $y = u(x)$ of every nth-order linear equation (27-7) can be
calculated by the so-called method of variation of parameters whenever the
general solution of the related homogeneous equation (27-1) is known.
To make the procedure clear, we first develop it for the second-order
equation
    $y'' + p_1(x)y' + p_2(x)y = f(x)$ (28-1)
and then extend it to the general case of Eq. (27-7). Let us suppose that
    $y = c_1 y_1(x) + c_2 y_2(x)$ (28-2)
is the general solution of the homogeneous equation
    $y'' + p_1(x)y' + p_2(x)y = 0.$ (28-3)
We shall attempt to find an integral of (28-1) in the form
    $y = v_1(x)y_1(x) + v_2(x)y_2(x),$ (28-4)
obtained from (28-2) by replacing the constants $c_i$ by some unknown
functions $v_i(x)$.
If we substitute (28-4) in (28-1), we shall obtain one equation which
imposes a condition to be satisfied by two unknown functions $v_1(x)$ and
$v_2(x)$. Since one such condition does not determine the unknown functions,
we need another equation relating $v_1$ and $v_2$. We shall impose this second
condition in a way that would tend to simplify the calculation of $v_1$ and $v_2$.
If we differentiate (28-4), we get
    $y' = (v_1 y_1' + v_2 y_2') + (v_1' y_1 + v_2' y_2).$ (28-5)
Now, the calculation of $y''$ will be materially simplified if $v_1$ and $v_2$ are
chosen so that the expression in the second parentheses in (28-5) vanishes.
Accordingly, we set
    $v_1' y_1 + v_2' y_2 = 0$ (28-6)
and take
    $y' = v_1 y_1' + v_2 y_2'.$ (28-7)
Then
    $y'' = v_1 y_1'' + v_2 y_2'' + v_1' y_1' + v_2' y_2'.$ (28-8)
The substitution from (28-4), (28-7), and (28-8) in the original equation
(28-1) yields, on rearrangement,
    $v_1(y_1'' + p_1 y_1' + p_2 y_1) + v_2(y_2'' + p_1 y_2' + p_2 y_2) + v_1' y_1' + v_2' y_2' = f(x).$ (28-9)
But since $y_1$ and $y_2$ are known to satisfy (28-3), the expressions in the
parentheses in (28-9) vanish. We thus get
    $v_1' y_1' + v_2' y_2' = f(x).$ (28-10)
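Solving Eqs. (28-6) and (28-10) for $v_1'$ and $v_2'$, we obtain
    $v_1' = -\dfrac{y_2 f(x)}{W(y_1, y_2)}, \qquad v_2' = \dfrac{y_1 f(x)}{W(y_1, y_2)},$ (28-11)
where the Wronskian $W(y_1, y_2) = y_1 y_2' - y_2 y_1'$
never vanishes inasmuch as $y_1$ and $y_2$ are linearly independent solutions of
Eq. (28-3).
The right-hand members of Eqs. (28-11) are known functions of x, and
on integrating them we obtain $v_1(x)$ and $v_2(x)$. We can thus write an
integral of (28-1) in the form
    $y = -y_1(x) \int \dfrac{y_2(x) f(x)}{W(y_1, y_2)}\,dx + y_2(x) \int \dfrac{y_1(x) f(x)}{W(y_1, y_2)}\,dx,$ (28-12)
obtained by inserting $v_1(x)$ and $v_2(x)$ in (28-4).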
Example 1. Find an integral of
    $x^2 y'' - 2xy' + 2y = x \log x, \qquad \text{if } x > 0.$ (28-13)
It is easily checked that a pair of linearly independent solutions of the homogeneous
equation associated with (28-13) is $y_1 = x$, $y_2 = x^2$. Thus its general solution is
$y = c_1 x + c_2 x^2$. Accordingly, we seek an integral of (28-13) in the form
    $y = v_1 x + v_2 x^2.$ (28-14)
On dividing (28-13) through by $x^2$ to reduce it to the standard form (28-1), we see that
$f(x) = (\log x)/x$. Thus, Eqs. (28-6) and (28-10) yield
    $v_1' x + v_2' x^2 = 0,$
    $v_1' + v_2' \cdot 2x = \dfrac{\log x}{x}.$
Solving these for $v_1'$ and $v_2'$ we obtain
    $v_1' = -\dfrac{\log x}{x}, \qquad v_2' = \dfrac{\log x}{x^2},$
and thus
    $v_1 = -\int \dfrac{\log x}{x}\,dx, \qquad v_2 = \int \dfrac{\log x}{x^2}\,dx.$
Integrating these and dropping integration constants (for any integral will do), we find
    $v_1 = -\tfrac{1}{2}(\log x)^2, \qquad v_2 = -\dfrac{1}{x}(1 + \log x).$ (28-15)
The substitution from (28-15) in (28-14) yields the desired integral of (28-13) in the form
    $y = -x[1 + \log x + \tfrac{1}{2}(\log x)^2].$
Of course, we could have obtained this result directly from formula (28-12).
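Formula (28-12) also lends itself to numerical evaluation when the integrals are awkward. The following sketch applies it to the data of Example 1 with a composite Simpson rule; the lower limit `x0 = 1` and all function names are assumptions made for illustration. Taking definite integrals from `x0` amounts to a particular choice of the integration constants, so the result may differ from the closed form of the text by a solution of the homogeneous equation (here by $x^2$).

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

# Data of Example 1 in standard form: y'' - (2/x)y' + (2/x^2)y = (log x)/x.
y1 = lambda x: x
y2 = lambda x: x * x
wronskian = lambda x: x * x            # y1*y2' - y2*y1' = 2x^2 - x^2
f = lambda x: math.log(x) / x

def particular(x, x0=1.0):
    """Formula (28-12) with both integrals taken from x0 (an assumed lower limit)."""
    v1 = -simpson(lambda t: y2(t) * f(t) / wronskian(t), x0, x)
    v2 = simpson(lambda t: y1(t) * f(t) / wronskian(t), x0, x)
    return v1 * y1(x) + v2 * y2(x)

def closed_form(x):
    """Integral found in the text: -x[1 + log x + (1/2)(log x)^2]."""
    L = math.log(x)
    return -x * (1 + L + 0.5 * L * L)

x = 2.0
print(particular(x) - closed_form(x), x * x)   # the two numbers should agree
```

The difference between the two integrals being a multiple of $y_2 = x^2$ illustrates the remark that any choice of integration constants will do.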
The foregoing procedure can be generalized to compute an integral of
    $y^{(n)} + p_1(x)y^{(n-1)} + \cdots + p_{n-1}(x)y' + p_n(x)y = f(x).$ (28-16)
If $y_1(x), \ldots, y_n(x)$ is a known set of linearly independent solutions of the
corresponding homogeneous equation, we seek an integral of (28-16) in the
form
    $y = v_1(x)y_1(x) + v_2(x)y_2(x) + \cdots + v_n(x)y_n(x),$ (28-17)
where the $v_i(x)$ are unknown functions. To determine them we form the
set of n - 1 equations by equating to zero the terms involving the $v_i'(x)$
in the expressions resulting from differentiating (28-17) successively n - 1
times. The nth equation is got by inserting the corresponding values of
derivatives in (28-16). We illustrate the procedure by an example.¹
1 See also Prob. 5 at the end of this section.
Example 2. Find an integral of
    $y''' + \dfrac{1}{x^2}y' - \dfrac{1}{x^3}y = \dfrac{1}{x^2}\log x, \qquad x > 0.$ (28-18)
A set of linearly independent solutions of the corresponding homogeneous equation is
known to be¹
    $y_1 = x, \quad y_2 = x \log x, \quad y_3 = x(\log x)^2.$ (28-19)
Accordingly, we take the integral of (28-18) in the form
    $y = v_1 x + v_2 x \log x + v_3 x(\log x)^2.$ (28-20)
For the third-order equation the procedure just sketched yields the system of three
equations:
    $v_1' y_1 + v_2' y_2 + v_3' y_3 = 0,$
    $v_1' y_1' + v_2' y_2' + v_3' y_3' = 0,$ (28-21)
    $v_1' y_1'' + v_2' y_2'' + v_3' y_3'' = f(x).$
The reader will verify that, on setting $f(x) = (1/x^2) \log x$ and noting (28-19), the
system (28-21) yields
    $v_1' = \dfrac{1}{2x}(\log x)^3, \quad v_2' = -\dfrac{1}{x}(\log x)^2, \quad v_3' = \dfrac{1}{2x}\log x,$
and we can take
    $v_1 = \tfrac{1}{8}(\log x)^4, \quad v_2 = -\tfrac{1}{3}(\log x)^3, \quad v_3 = \tfrac{1}{4}(\log x)^2.$
Substitution in (28-20) gives finally $y = (x/24)(\log x)^4$.
PROBLEMS
1 . Use the method of variation of parameters to find integrals of the following equa-
tions with constant coefficients,
(а) y' 4 3 y ^ x 3 ; (c) y" ~ 2 y' 4 y » x;
( б ) /' 4- V 4 by «■ e x ; (d) y ,n - 3 y' 4 2 y « 2(sin x - 2 cos x).
2. Find the solution of
    $\dfrac{dy}{dx} + f_1(x)y = f_2(x)$
by the method of variation of parameters, and compare your result with that of Sec.
10. The solution of the related homogeneous equation is obtained easily by separation
of the variables.
3. By the method of variation of parameters, find a particular integral of
    $\dfrac{d^2 y}{dx^2} - \dfrac{3}{x}\dfrac{dy}{dx} - \dfrac{5}{x^2}y = \log x,$
where the general solution of the related homogeneous equation is
    $y = \dfrac{c_1}{x} + c_2 x^5.$
4 See Example 1, Sec. 30.
4. Find the general solution of
    $\dfrac{d^2 y}{dx^2} + \dfrac{x}{1 - x}\dfrac{dy}{dx} - \dfrac{1}{1 - x}y = 1 - x,$
where the general solution of the related homogeneous equation is $c_1 e^x + c_2 x$.
5. Show that the formula corresponding to (28-12) for an integral of (28-16) is
    $y(x) = \sum_{i=1}^{n} y_i(x) \int \dfrac{W_i(y_1, y_2, \ldots, y_n)}{W(y_1, y_2, \ldots, y_n)} f(x)\,dx,$
where $W(y_1, y_2, \ldots, y_n)$ is the Wronskian and $W_i$ is the determinant obtained from W
by replacing the ith column by $(0, 0, 0, \ldots, 1)$.
29. Reduction of the Order of Linear Equations. The method of
variation of parameters can be used to reduce the solution of every nth-order
linear homogeneous equation to the solution of a linear equation of order
n - 1 when one solution of the nth-order equation is known. This matter
is of some importance in deducing general solutions of second-order linear
equations, because one integral of such equations can often be determined
by inspection.
Let $y_1(x)$ be a solution of
    $y'' + p_1(x)y' + p_2(x)y = 0,$ (29-1)
so that $y = cy_1(x)$ is a solution for any constant c. If we replace c by an
unknown function v(x) and seek a solution of (29-1) in the form
    $y = v(x)y_1(x),$ (29-2)
we get, on differentiating (29-2),
    $y' = vy_1' + v'y_1,$ (29-3)
    $y'' = vy_1'' + 2v'y_1' + v''y_1.$
Substituting from (29-2) and (29-3) in (29-1) and noting that $y_1(x)$ is a
solution of (29-1), we get a separable equation
    $v''y_1 + v'(2y_1' + p_1 y_1) = 0$ (29-4)
for v(x).
Separation of variables in (29-4) gives
    $\dfrac{v''}{v'} = -2\dfrac{y_1'}{y_1} - p_1,$
so that
    $\log v' = -2 \log y_1 - \int p_1\,dx.$
Hence
    $v' = y_1^{-2} e^{-\int p_1\,dx}.$ (29-5)
We see that $v'(x) \ne 0$, so that $v \ne$ const.
Integrating (29-5) we obtain
    $v = \int y_1^{-2} e^{-\int p_1\,dx}\,dx,$ (29-6)
so that the second linearly independent solution of (29-1), by (29-2), is
    $y_2 = y_1(x) \int y_1^{-2}(x) e^{-\int p_1(x)\,dx}\,dx.$ (29-7)
We dispense with quite analogous calculations showing that the solution
of an nth-order linear equation can be reduced to that of a linear equation
of order n - 1 when one integral of the nth-order equation is known.
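Example. The equation
    $y'' - \dfrac{4x}{2x - 1}y' + \dfrac{4}{2x - 1}y = 0$
with $x \ne \tfrac{1}{2}$ has an obvious solution $y_1 = x$. To determine another solution we set
$y = vx$. The function v, determined by formula (29-6), is
    $v = \int x^{-2} e^{\int \frac{4x}{2x-1}\,dx}\,dx = \int x^{-2} e^{2x + \log(2x-1)}\,dx$
      $= \int \left(\dfrac{2}{x} - \dfrac{1}{x^2}\right) e^{2x}\,dx = \dfrac{e^{2x}}{x}.$
Thus the second solution is $y = vx = e^{2x}$.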
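Formula (29-6) can be checked numerically for this example. The sketch below assumes that the equation, written in the standard form (29-1), has $p_1 = -4x/(2x - 1)$, which is what the two stated solutions $y = x$ and $y = e^{2x}$ require; the integration interval and the names `simpson`, `v_prime`, and `v_exact` are illustrative choices.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def v_prime(x):
    """Integrand of (29-6) for y1 = x: x^{-2} exp(2x + log(2x - 1)), valid for x > 1/2."""
    return (2 * x - 1) * math.exp(2 * x) / x**2

def v_exact(x):
    """Antiderivative obtained in the example, v = e^{2x}/x."""
    return math.exp(2 * x) / x

numeric = simpson(v_prime, 1.0, 2.0)
print(numeric, v_exact(2.0) - v_exact(1.0))
```

Agreement of the two printed numbers confirms that $v = e^{2x}/x$ is an antiderivative of the integrand, so that $y = vx = e^{2x}$.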
PROBLEMS
1. The equation $x^2 y'' + 2xy' = 0$ has an obvious solution $y = 1$. Show that $y = 1/x$
is another solution, and thus find the general solution.
2. One solution of
    $y'' + \dfrac{x^2 - 2x - 2}{x^2 + x}y' - \dfrac{2x^2 - 2x - 2}{x^3 + x^2}y = 0$
obviously is $y = x^2$. Show that a second solution is $y = xe^{-x}$.
3. A special case of Legendre's equation
    $(1 - x^2)y'' - 2xy' + 2y = 0$
has an obvious solution $y = x$. Obtain a first-order equation for a second linearly
independent solution of this equation, and solve.
30. The Euler-Cauchy Equation. An equation of the form
    $x^n y^{(n)} + a_1 x^{n-1} y^{(n-1)} + \cdots + a_{n-1} xy' + a_n y = f(x),$ (30-1)
where the $a_i$ are constants, is usually called Cauchy's equation, although
it was examined earlier by Euler. We show that by a change of the
independent variable x, it can be transformed into an equation with constant
coefficients which can be solved by familiar methods.
If we set
    $x = e^z,$ (30-2)
then
    $\dfrac{dx}{dz} = e^z \quad \text{and} \quad \dfrac{dz}{dx} = e^{-z}.$
On writing $D \equiv d/dz$, we get
    $y' = \dfrac{dy}{dz}\dfrac{dz}{dx} = e^{-z} Dy.$
Also
    $y'' = \dfrac{dy'}{dx} = \dfrac{dy'}{dz}\dfrac{dz}{dx} = D(e^{-z} Dy)e^{-z}$
        $= e^{-2z}(D^2 - D)y$
        $= e^{-2z} D(D - 1)y.$
In a similar way we find
    $y^{(n)} = e^{-nz} D(D - 1)(D - 2) \cdots (D - n + 1)y.$ (30-3)
From (30-2), $x^n = e^{nz}$, and the substitution from (30-3) in (30-1)
therefore yields the equation with constant coefficients
    $[D(D - 1)(D - 2) \cdots (D - n + 1)$
        $+ a_1 D(D - 1) \cdots (D - n + 2) + \cdots + a_{n-1} D + a_n]y = f(e^z).$ (30-4)
If a solution of (30-4) is denoted by $y = F(z)$, then the solution of the
original equation, as follows from (30-2), is $y = F(\log x)$.
Example 1. Find the general solution of
    $x^3 y''' + xy' - y = x \log x.$ (30-5)
Upon setting $x = e^z$, this equation becomes
    $[D(D - 1)(D - 2) + D - 1]y = ze^z$
or  $(D^3 - 3D^2 + 3D - 1)y = ze^z.$ (30-6)
The roots of the characteristic equation obviously are $m_1 = m_2 = m_3 = 1$. Hence the
solution of the homogeneous equation is $(c_1 + c_2 z + c_3 z^2)e^z$.
Inasmuch as the characteristic equation has a triple root and the right-hand member
of (30-6) is a solution of the homogeneous equation, we take the trial integral in the form
(see Sec. 27)
    $y = az^4 e^z.$
The substitution of the trial integral in (30-6) shows that $a = \tfrac{1}{24}$, so that the general
solution of (30-6) is
    $y = (c_1 + c_2 z + c_3 z^2)e^z + \tfrac{1}{24} z^4 e^z.$
Finally, the substitution of $z = \log x$ gives
    $y = [c_1 + c_2 \log x + c_3(\log x)^2]x + \tfrac{1}{24} x(\log x)^4,$
which is the desired solution of (30-5).
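The passage from (30-1) to the constant-coefficient operator in (30-4) is purely algebraic and can be automated. The following sketch expands the products $D(D - 1) \cdots (D - k + 1)$ and collects powers of D; the function names and the convention that `coeffs[k]` multiplies $x^k y^{(k)}$ are assumptions made for this illustration.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest power first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def falling_factorial(k):
    """Coefficients of D(D-1)...(D-k+1), which replaces x^k y^(k) when x = e^z."""
    poly = [1]
    for j in range(k):
        poly = poly_mul(poly, [-j, 1])   # factor (D - j)
    return poly

def constant_coefficient_operator(coeffs):
    """coeffs[k] multiplies x^k y^(k); returns the polynomial in D, lowest power first."""
    result = [0] * len(coeffs)
    for k, c in enumerate(coeffs):
        for power, value in enumerate(falling_factorial(k)):
            result[power] += c * value
    return result

# x^3 y''' + x y' - y: coefficients of y, xy', x^2 y'', x^3 y'''.
print(constant_coefficient_operator([-1, 1, 0, 1]))   # [-1, 3, -3, 1]
```

The printed list corresponds to $D^3 - 3D^2 + 3D - 1$, the operator of (30-6); its roots are then found as for any equation with constant coefficients.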
The general solution of the homogeneous equation
    $x^n y^{(n)} + a_1 x^{n-1} y^{(n-1)} + \cdots + a_{n-1} xy' + a_n y = 0,$
associated with (30-1), can often be found by taking a trial solution $y = x^m$.
This is illustrated in the following example.
Example 2. Solve $x^2 y'' + 2xy' = 0$. The substitution of $y = x^m$ yields the equation
    $m(m - 1)x^m + 2mx^m = 0,$
or  $m(m - 1) + 2m = 0.$
Since $m = 0$ and $m = -1$ satisfy this equation, $y = x^0 = 1$ and $y = x^{-1}$ are linearly
independent solutions of the given equation. The general solution, therefore, is
    $y = c_1 + c_2 x^{-1}.$
PROBLEMS
Find the general solutions of:
1. $x^2 y'' + 4xy' + 2y = \log x$;    2. $x^3 y''' - 4x^2 y'' + 5xy' - 2y = 1$;
3. $x^2 y'' + y = x^2$;    4. $x^2 y'' - 2xy' + 2y = x \log x$.
By assuming a solution of the form $y = x^m$ solve:
5. $x^2 y'' - 4xy' + 6y = 0$;    6. $x^2 y'' + 2xy' - n(n + 1)y = 0$.
APPLICATIONS OF LINEAR EQUATIONS
31. Free Vibrations of Electrical and Mechanical Systems. We saw in
Sec. 19 that the equation
    $\dfrac{d(mv)}{dt} = F(s, v, t),$ (31-1)
stating Newton's second law of motion, is readily integrable when the
external force F is a function of the displacement s alone, when it is a
function of the velocity v alone, or when it depends only on the time t.
In this section we examine other types of this equation which are of cardinal
importance in the analysis of oscillating electrical and mechanical systems.
Throughout this discussion we shall assume that the mass m is constant,
so that Eq. (31-1) can be written as
    $m\dfrac{d^2 s}{dt^2} = F(s, v, t),$
where $v = ds/dt$.
We begin our study with a simple mechanical system which is a proto-
type of more general systems that appear in the analysis of vibrations of
elastic structures.
Let it be required to determine the position of the end of an elastic
spring set oscillating in a vacuum. If a mass M is applied to one end of
the spring whose other end is fixed, it will produce the elongation s, which,
according to Hooke's law, is proportional to the applied force $F = Mg$, g
being the gravitational acceleration. Thus,
    $F = ks = Mg,$
where k is the stiffness of the spring.
If at any later time t an additional force is applied to produce an
extension y, after which this additional force is removed, the spring will start
oscillating. The problem is to determine the position of the end point of
the spring at any subsequent time.
The forces acting on the mass M are the force of gravity Mg downward,
which will be taken as the positive direction for the displacement y, and the
tension T in the spring, which acts in the direction opposite to that of the
force of gravity (Fig. 16). Hence, from Newton's second law of motion,
    $M\dfrac{d^2 y}{dt^2} = Mg - T.$
Since T is the tension in the spring when its elongation is s + y, Hooke's
law states that $T = k(s + y)$, so that
    $M\dfrac{d^2 y}{dt^2} = Mg - k(s + y).$
But $Mg = ks$, and therefore the foregoing equation becomes
    $M\dfrac{d^2 y}{dt^2} + ky = 0.$
Setting $k/M = a^2$ reduces this to
    $\dfrac{d^2 y}{dt^2} + a^2 y = 0 \quad \text{or} \quad (D^2 + a^2)y = 0.$ (31-2)
Factoring gives $(D - ai)(D + ai)y = 0$, from which it is clear that the
general solution is
    $y = c_1 e^{-ait} + c_2 e^{ait},$
or, in real form,
    $y = A \cos at + B \sin at.$
The arbitrary constants A and B can be determined from the initial
conditions. The solution reveals the fact that the spring vibrates with a
simple harmonic motion whose period is
    $T = \dfrac{2\pi}{a} = 2\pi\sqrt{\dfrac{M}{k}}.$
The period depends on the stiffness of the spring as would be expected:
the stiffer the spring, the greater the frequency of vibration.
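The solution of (31-2) is simple enough to evaluate directly. The following sketch fixes A and B from the initial displacement and velocity; the function name `spring_motion` and the numerical values of mass and stiffness are illustrative assumptions, in any consistent system of units.

```python
import math

def spring_motion(mass, stiffness, y0, v0):
    """Return (a, period, y) for M y'' + k y = 0 with y(0) = y0, y'(0) = v0."""
    a = math.sqrt(stiffness / mass)
    period = 2 * math.pi / a
    A, B = y0, v0 / a                     # from y = A cos at + B sin at
    return a, period, lambda t: A * math.cos(a * t) + B * math.sin(a * t)

# Illustrative values: mass 4, stiffness 400, released from rest at y = 2.
a, period, y = spring_motion(4.0, 400.0, 2.0, 0.0)
print(a, period, y(0.0), y(period))
```

Quadrupling the stiffness doubles a and halves the period, in line with the closing remark of the paragraph above.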
It is instructive to compare the solution just obtained with that of the
corresponding electrical problem. It will be seen that a striking analogy
exists between the mechanical and electrical
systems. This analogy permits one to replace
a study of complicated mechanical systems by
the analysis of performance of mathematically
equivalent simple electrical circuits.
Let a condenser (Fig. 17) be discharged through an inductive coil of
negligible resistance. It is known that the charge Q on a condenser plate is
proportional to the potential difference of the plates; that is,
    $Q = CV,$
Fig. 17
where C is the capacity of the condenser. Moreover, the current I flowing
through the coil is
    $I = -\dfrac{dQ}{dt},$
and, if the inductance be denoted by L, the emf opposing V is $L\,dI/dt$,
since the IR drop is assumed to be negligible. Thus,
    $V - L\dfrac{dI}{dt} = 0 \quad \text{or} \quad \dfrac{Q}{C} + L\dfrac{d^2 Q}{dt^2} = 0.$
Simplifying gives
    $\dfrac{d^2 Q}{dt^2} + \dfrac{1}{CL}Q = 0,$
which is of precisely the same form as (31-2), where $a^2 = 1/CL$, and the
general solution is then
    $Q = A \cos \dfrac{t}{\sqrt{CL}} + B \sin \dfrac{t}{\sqrt{CL}}.$
The period of oscillation is
    $T = 2\pi\sqrt{CL}.$
Note that we can make the inductance L correspond to the mass M of the
mechanical example and 1/C correspond to the stiffness k of the spring.
32. Viscous Damping. Let the spring of the mechanical example of
Sec. 31 be placed in a resisting medium in which the damping force is
proportional to the velocity. This kind of damping is termed viscous damping.
Since the resisting medium opposes the displacement, the damping force
$r(dy/dt)$ acts in the direction opposite to that of the displacement of the
mass M. The force equation, in this case, becomes
    $M\dfrac{d^2 y}{dt^2} = Mg - k(y + s) - r\dfrac{dy}{dt}$
or, since $Mg = ks$,
    $\dfrac{d^2 y}{dt^2} + \dfrac{r}{M}\dfrac{dy}{dt} + \dfrac{k}{M}y = 0.$
To solve this equation we write it in the more convenient form
    $\dfrac{d^2 y}{dt^2} + 2b\dfrac{dy}{dt} + a^2 y = 0,$ (32-1)
where $2b = r/M$ and $a^2 = k/M$. In this case the characteristic equation
is
    $m^2 + 2bm + a^2 = 0$
and its roots are
    $m = -b \pm \sqrt{b^2 - a^2},$
so that the general solution is
    $y = c_1 e^{(-b + \sqrt{b^2 - a^2})t} + c_2 e^{(-b - \sqrt{b^2 - a^2})t}.$ (32-2)
It will be instructive to interpret the physical significance of the solution
(32-2) corresponding to the three distinct cases that arise when $b^2 - a^2 > 0$,
$b^2 - a^2 = 0$, and $b^2 - a^2 < 0$. If $b^2 - a^2$ is positive, the roots m are
real and distinct. Denote them by $m_1$ and $m_2$, so that (32-2) is
    $y = c_1 e^{m_1 t} + c_2 e^{m_2 t}.$ (32-3)
The arbitrary constants $c_1$ and $c_2$ are determined from the initial conditions.
Thus, let the spring be stretched so that y = d and then released without
giving the mass M an initial velocity. The conditions are then
    $y = d$
when t = 0 and
    $\dfrac{dy}{dt} = 0$
when t = 0.
Substituting these values into (32-3) and the derivative of (32-3) gives
the two equations
    $d = c_1 + c_2 \quad \text{and} \quad 0 = m_1 c_1 + m_2 c_2.$
These determine
    $c_1 = -\dfrac{m_2 d}{m_1 - m_2} \quad \text{and} \quad c_2 = \dfrac{m_1 d}{m_1 - m_2}.$
Hence, the solution (32-3) is
    $y = \dfrac{d}{m_1 - m_2}(m_1 e^{m_2 t} - m_2 e^{m_1 t}).$
The graph of the displacement represented as a function of t is of the type
shown in Fig. 18. Theoretically, y never becomes zero, although it comes
arbitrarily close to it. This is the so-called overdamped case. The retarding
force is so great in this case that no vibration can occur.
If $b^2 - a^2 = 0$, the two roots of the characteristic equation are equal
and the general solution of (32-1) becomes
    $y = e^{-bt}(c_1 + c_2 t).$
If the initial conditions are y = d, dy/dt = 0 when t = 0, the solution is
    $y = de^{-bt}(1 + bt).$
This type of motion of the spring is called deadbeat. If the retarding force
is decreased by an arbitrarily small amount, the motion will become
oscillatory.
The most interesting case occurs when $b^2 < a^2$, so that the roots of the
characteristic equation are imaginary. Denote $b^2 - a^2$ by $-\alpha^2$, so that
    $m = -b \pm i\alpha$
and
    $y = c_1 e^{(-b + i\alpha)t} + c_2 e^{(-b - i\alpha)t} = e^{-bt}(A \cos \alpha t + B \sin \alpha t).$
If the initial conditions are chosen as before,
    $y = d$
when t = 0 and
    $\dfrac{dy}{dt} = 0$
when t = 0, the arbitrary constants A and B can be evaluated. The result
is
    $y = de^{-bt}\left(\cos \alpha t + \dfrac{b}{\alpha} \sin \alpha t\right),$
which can be put in a more convenient form by the use of the identity
    $A \cos \theta + B \sin \theta = \sqrt{A^2 + B^2} \cos\left(\theta - \tan^{-1}\dfrac{B}{A}\right).$
The solution then appears as
    $y = \dfrac{d}{\alpha}\sqrt{\alpha^2 + b^2}\,e^{-bt} \cos\left(\alpha t - \tan^{-1}\dfrac{b}{\alpha}\right).$ (32-4)
The nature of the motion as described by (32-4) is seen from Fig. 19.
It is an oscillatory motion with the amplitude decreasing exponentially.
The period of the motion is $T = 2\pi/\alpha$.
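The three cases of this section can be collected in one routine. The sketch below evaluates the displacement for the initial conditions y = d, dy/dt = 0 at t = 0 used in the text; the function name and the sample parameters are assumptions for illustration, and the exact comparison `disc == 0` is only appropriate when b and a are chosen so that the discriminant is computed exactly.

```python
import math

def damped_displacement(b, a, d, t):
    """y(t) for y'' + 2b y' + a^2 y = 0 with y(0) = d, y'(0) = 0."""
    disc = b * b - a * a
    if disc > 0:                                   # overdamped: real distinct roots
        root = math.sqrt(disc)
        m1, m2 = -b + root, -b - root
        return d * (m1 * math.exp(m2 * t) - m2 * math.exp(m1 * t)) / (m1 - m2)
    if disc == 0:                                  # deadbeat: equal roots
        return d * math.exp(-b * t) * (1 + b * t)
    alpha = math.sqrt(-disc)                       # oscillatory: complex roots
    return d * math.exp(-b * t) * (math.cos(alpha * t) + (b / alpha) * math.sin(alpha * t))

for b in (3.0, 2.0, 0.5):                          # a = 2 in all three runs
    print(b, [round(damped_displacement(b, 2.0, 1.0, t), 4) for t in (0.0, 0.5, 1.0, 2.0)])
```

Only the last run changes sign, which is the numerical counterpart of the statement that vibration occurs only when $b^2 < a^2$.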
An electrical problem corresponding to the example of the viscous
damping of a spring is the following: A condenser (Fig. 20) of capacity C
is discharged through an inductive coil whose resistance is not negligible.
Referring to Sec. 31 and remembering that the IR drop is not negligible,
we find the voltage equation to be
    $V - L\dfrac{dI}{dt} - IR = 0$
or
    $\dfrac{Q}{C} + L\dfrac{d^2 Q}{dt^2} + R\dfrac{dQ}{dt} = 0.$
Simplifying gives
    $\dfrac{d^2 Q}{dt^2} + \dfrac{R}{L}\dfrac{dQ}{dt} + \dfrac{Q}{CL} = 0,$
and this equation is of the same form as that in the mechanical example.
The mass M corresponds to the inductance L, r corresponds to the electrical
resistance R, and the stiffness k corresponds to 1/C. Its solution is the
same as that of the corresponding mechanical example and is obtained by
setting $2b = R/L$ and $a^2 = 1/CL$.
Fig. 20
PROBLEMS
1. The force of 1,000 dynes will stretch a spring 1 cm. A mass of 100 g is suspended
at the end of the spring and set vibrating. Find the equation of motion and the frequency
of vibration if the mass is pulled down 2 cm and then released. What will be the solution
if the mass is projected down from rest with a velocity of 10 cm per sec?
2. Two equal masses are suspended at the end of an elastic spring of stiffness k. One
mass falls off. Describe the motion of the remaining mass.
3. The force of 98,000 dynes extends a spring 2 cm. A mass of 200 g is suspended at
the end, and the spring is pulled down 10 cm and released. Find the position of the
mass at any instant t if the resistance of the medium is neglected.
4. Solve Prob. 3 under the assumption that the spring is viscously damped. It is
given that the resistance is 2,000 dynes for a velocity of 1 cm per sec. What must the
resistance be in order that the motion be deadbeat?
5. A condenser of capacity 4 is charged so that the potential difference of the plates
is 100 volts. The condenser is then discharged through a coil of resistance 500 ohms
and inductance 0.5 henry. Find the potential difference at any later time t. How large
must the resistance be in order that the discharge just fails to be oscillatory? Determine
the potential difference for this case. Note that the equation in this case is
    $L\dfrac{d^2 V}{dt^2} + R\dfrac{dV}{dt} + \dfrac{V}{C} = 0.$
6. Solve Prob. 5 if R = 100 ohms, C = 0.5 μf, and L = 0.001 henry.
7. A simple pendulum of length l is oscillating through a small angle θ in a medium
in which the resistance is proportional to the velocity. Show that the differential
equation of the motion is
    $\dfrac{d^2\theta}{dt^2} + 2k\dfrac{d\theta}{dt} + \dfrac{g}{l}\theta = 0.$
Discuss the motion, and show that the period is $2\pi/\sqrt{\omega^2 - k^2}$, where $\omega^2 = g/l$.
8. An iceboat weighing 500 lb is driven by a wind that exerts a force of 25 lb. Five
pounds of this force is expended in overcoming frictional resistance. What speed will
this boat acquire at the end of 30 sec if it starts from rest? Hint: The force producing
the motion is F = 25 - 5 = 20. Hence, $500\,dv/dt = 20g$.
9. A body is set sliding down an inclined plane with an initial velocity of $v_0$ fps. If
the angle made by the plane with the horizontal is θ and the coefficient of friction is μ,
show that the distance traveled in t sec is
    $s = \tfrac{1}{2}g(\sin\theta - \mu\cos\theta)t^2 + v_0 t.$
Hint: $m\,d^2 s/dt^2 = mg\sin\theta - \mu mg\cos\theta$.
10. One end of an elastic rubber band is fastened at a point P, and the other end
supports a mass of 10 lb. When the mass is suspended freely, its weight doubles the
length of the band. If the original length of the band is 1 ft and the weight is dropped
from the point P, how far will the band extend? What is the equation of motion?
11. It is shown in books on strength of materials and elasticity that the deflection of
a long beam lying on an elastic base, the reaction of which is proportional to the
deflection y, satisfies the differential equation
    $EI\dfrac{d^4 y}{dx^4} = -ky.$
Set $a^4 = k/4EI$, and show that the characteristic equation corresponding to the
resulting differential equation is $m^4 + 4a^4 = 0$, whose roots are $m = \pm a \pm ai$. Thus show
that the general solution is
    $y = c_1 e^{ax}\cos ax + c_2 e^{ax}\sin ax + c_3 e^{-ax}\cos ax + c_4 e^{-ax}\sin ax.$
12. If a long column is subjected to an axial load P and the assumption that the
curvature is small is not made, the Bernoulli-Euler law gives (see Sec. 5)
    $\dfrac{d^2 y/dx^2}{[1 + (dy/dx)^2]^{3/2}} = \dfrac{M}{EI}.$
Fig. 21
Since the moment M is equal to -Py (Fig. 21), it follows upon setting
dy/dx = p that the differential equation of the deformed central axis is
    $\dfrac{p\,(dp/dy)}{(1 + p^2)^{3/2}} = -\dfrac{Py}{EI}.$
Solve this differential equation for p , and show that the length of the
central line is given by the formula
where $k^2 = d^2 P/4EI$, d is the maximum deflection, and $F(k, \pi/2)$ is the elliptic integral
of the first kind [see Eq. (20-12)]. The equation of the elastic curve, in this case, cannot
be expressed in terms of elementary functions, for the formula for y leads to an elliptic
integral. See, however, Chap. 2, Sec. 10.
33. Forced Vibrations. Resonance. In the discussion of Sec. 32, it was
supposed that the vibrations were free. Thus, in the case of the mechanical
example, it was assumed that the point of support of the spring was stationary
and, in the electrical example, that there was no source of emf placed
in series with the coil.
Now, suppose that the point of support of the spring is vibrating in
accordance with some law which gives the displacement of the top of the
spring as a function of the time t, say $x = f(t)$, where x is measured positively
downward. Just as before, the spring is supposed to be supporting a
mass M, which produces an elongation s of the spring. If the displacement
of the mass M from its position of rest is y, it is clear that when the
top of the spring is displaced through a distance x, the actual extension
of the spring is y - x. If the resistance of the medium is neglected, the
force equation is
    $M\dfrac{d^2 y}{dt^2} = Mg - k(s + y - x) = -k(y - x),$
whereas if the spring is viscously damped, it is
    $M\dfrac{d^2 y}{dt^2} = -k(y - x) - r\dfrac{dy}{dt}.$
Upon simplification this last equation becomes
    $M\dfrac{d^2 y}{dt^2} + r\dfrac{dy}{dt} + ky = kx,$ (33-1)
where x is supposed to be a known function of t.
The corresponding electrical example is that of a condenser (Fig. 22)
which is placed in series with the source of emf and discharges through a coil
containing inductance and resistance. The voltage equation is
$$-RI - L\frac{dI}{dt} + V = f(t),$$
where $f(t)$ is the impressed emf given as a function of $t$. Since
$$I = -C\frac{dV}{dt},$$
the equation becomes
$$CL\frac{d^2V}{dt^2} + CR\frac{dV}{dt} + V = f(t). \tag{33-2}$$
An interesting case arises when the impressed emf is sinusoidal; for example,
$$f(t) = E_0 \sin \omega t.$$
Then the equation takes the form
$$\frac{d^2V}{dt^2} + \frac{R}{L}\frac{dV}{dt} + \frac{1}{CL}V = \frac{1}{CL}E_0 \sin \omega t.$$
Both (33-1) and (33-2) are nonhomogeneous linear equations with
constant coefficients of the type
$$\frac{d^2y}{dt^2} + 2b\frac{dy}{dt} + a^2y = a^2f(t). \tag{33-3}$$
The solution of this equation is the sum of the complementary function
and a particular integral. The complementary function has the form
(32-2), namely,
$$c_1e^{m_1t} + c_2e^{m_2t},$$
where
$$m_1 = -b + \sqrt{b^2 - a^2} \quad\text{and}\quad m_2 = -b - \sqrt{b^2 - a^2}.$$
A particular integral $y = u(t)$ can be deduced for Eq. (33-3) for an arbitrary
continuous function $f(t)$ by the method of variation of parameters.¹
If the impressed force $f(t)$ in (33-3) is simple harmonic of period $2\pi/\omega$
and amplitude $a_0$, then
$$f(t) = a_0 \sin \omega t$$
and an integral $y = u(t)$ can be obtained by the method of Sec. 25. The
result is
$$y = \frac{a^2a_0}{\sqrt{(a^2 - \omega^2)^2 + 4b^2\omega^2}}\,\sin(\omega t - \epsilon), \tag{33-4}$$
where
$$\epsilon = \tan^{-1}\frac{2b\omega}{a^2 - \omega^2}.$$
From the discussion in Sec. 32, it is clear that the part of the general solution
of (33-3) which is due to free vibrations is a decreasing function of $t$,
becoming negligibly small after a sufficient lapse of time. Thus, the "steady-state
solution" is given by the particular integral (33-4).
¹ See the corresponding computation at the end of this section for the case when $b = 0$.
It must be observed that when the impressed frequency $\omega$ is very high,
the amplitude of the sinusoid in (33-4) is small. When $\omega$ is nearly equal
to the natural frequency $a$, the amplitude is nearly $a_0a/2b$. This may be
dangerously large if the resistance parameter $b$ is small. For $a$ and $b$
fixed, the maximum amplitude occurs when
$$\frac{d}{d\omega}\left[(a^2 - \omega^2)^2 + 4b^2\omega^2\right] = 0,$$
that is, when
$$\omega^2 = a^2 - 2b^2. \tag{33-5}$$
Stated in terms of the physical quantities of electrical and mechanical
examples, a large amplitude in (33-4) means a large maximum emf, or a
large maximum displacement of the spring. These, as we have already
noted, may become excessively large when the resistance $r$ of the medium is
small and the impressed frequency $\omega$ is close to the natural frequency $a$.
This phenomenon, known as resonance, is of profound importance in
numerous engineering and physical situations.¹
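The amplitude in (33-4) and the condition (33-5) can be checked numerically. The
following is a minimal sketch in Python; the function names and the sample values
of $a$, $b$, and $a_0$ are assumptions made for illustration and do not come from the text.

```python
import math

def amplitude(w, a, b, a0=1.0):
    """Amplitude of the steady-state solution (33-4) at impressed frequency w."""
    return a * a * a0 / math.sqrt((a * a - w * w) ** 2 + 4.0 * b * b * w * w)

def resonant_frequency(a, b):
    """Frequency of maximum amplitude from (33-5); requires a^2 > 2 b^2."""
    return math.sqrt(a * a - 2.0 * b * b)

a, b = 2.0, 0.1                      # illustrative values only
w_star = resonant_frequency(a, b)

# Brute-force scan over a grid of frequencies to confirm the location of the maximum.
grid = [0.001 * i for i in range(1, 4001)]
w_best = max(grid, key=lambda w: amplitude(w, a, b))

print(w_star, w_best)                # the two frequencies should agree to grid accuracy
print(amplitude(w_star, a, b), a / (2.0 * b))   # peak amplitude vs. the estimate a0*a/2b
```

With a small resistance parameter the peak amplitude is close to the estimate
$a_0a/2b$ quoted above, and it grows without bound as $b$ tends to zero.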
If $b = 0$, Eq. (33-3) reduces to
$$\frac{d^2y}{dt^2} + a^2y = a^2f(t). \tag{33-6}$$
We can easily deduce a formula for an integral $y(t)$ of (33-6) for an arbitrary
forcing function $f(t)$. Since $\sin at$ and $\cos at$ are linearly independent
solutions of Eq. (33-6) with $f(t) = 0$, the method of variation of parameters
of Sec. 28 suggests taking a solution in the form
$$y(t) = v_1(t)\cos at + v_2(t)\sin at. \tag{33-7}$$
For the determination of $v_1(t)$ and $v_2(t)$ we have a pair of equations [see
Eqs. (28-6) and (28-10)]
$$v_1'\cos at + v_2'\sin at = 0,$$
$$-av_1'\sin at + av_2'\cos at = a^2f(t).$$
Solving these for $v_1'$ and $v_2'$ we get
$$v_1' = -af(t)\sin at, \qquad v_2' = af(t)\cos at,$$
which on integration between the limits 0 and $t$ yield
$$v_1(t) = -a\int_0^t f(t)\sin at\,dt, \qquad v_2(t) = a\int_0^t f(t)\cos at\,dt.$$
¹ The failure of the Tacoma bridge was explained by some authorities on the basis of
resonant forced vibrations, and there are instances of the collapse of buildings induced
by the rhythmic swaying of dancing couples. The failure of propeller shafts is often
attributed to forced torsional vibrations. See also Joshua 6:5.
Formula (33-7) then yields
$$y(t) = -a\cos at\int_0^t f(\lambda)\sin a\lambda\,d\lambda + a\sin at\int_0^t f(\lambda)\cos a\lambda\,d\lambda, \tag{33-8}$$
in which we have replaced the integration variable $t$ by $\lambda$ so as not to confuse
it with the variable $t$ in the limits. It follows directly from (33-8)
that the integral $y(t)$ corresponds to the initial conditions $y(0) = y'(0) = 0$.
If we combine integrals in (33-8), we get the desired formula
$$y(t) = a\int_0^t f(\lambda)\sin a(t - \lambda)\,d\lambda. \tag{33-9}$$
When the forcing function $f(t)$ is taken in the form $f(t) = a_0\sin at$ (so that
the impressed frequency is equal to the natural frequency), this formula
yields
$$y(t) = aa_0\int_0^t \sin a\lambda\,\sin a(t - \lambda)\,d\lambda.$$
After simple integration we obtain
$$y = \frac{a_0}{2}(\sin at - at\cos at),$$
representing a vibration whose amplitude increases with time, for the
amplitude $a_0/2$ in the first term is constant and the amplitude of the
second term, $a_0at/2$, grows with $t$. In any physical situation, some resistance
is present, and a reference to (33-4) shows that $b$ prevents oscillations
from becoming arbitrarily large. Nevertheless, they may be dangerously
large if $b$ is small and $a$ is near $\omega$.
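Formula (33-9) lends itself to a direct numerical check. The sketch below evaluates
the integral by Simpson's rule and compares it with the closed form just obtained for
forcing at the natural frequency. The function name, the number of subintervals,
and the sample values are assumptions chosen only for illustration.

```python
import math

def forced_response(f, a, t, n=2000):
    """Approximate (33-9): y(t) = a * integral_0^t f(s) sin a(t - s) ds, by Simpson's rule."""
    if t == 0.0:
        return 0.0
    h = t / n                                  # n must be even for Simpson's rule
    total = f(0.0) * math.sin(a * t) + f(t) * math.sin(0.0)
    for i in range(1, n):
        s = i * h
        weight = 4.0 if i % 2 else 2.0         # interior Simpson weights 4, 2, 4, ...
        total += weight * f(s) * math.sin(a * (t - s))
    return a * total * h / 3.0

a, a0 = 3.0, 0.5                               # illustrative values only

def f(s):
    """Forcing function a0 sin(a s), impressed at the natural frequency."""
    return a0 * math.sin(a * s)

t = 4.0
numeric = forced_response(f, a, t)
closed = 0.5 * a0 * (math.sin(a * t) - a * t * math.cos(a * t))
print(numeric, closed)
```

The same routine accepts any continuous forcing function, which is the point of (33-9):
the response with zero initial data is a single integral of the input.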
PROBLEM
Obtain a formula for a particular integral of Eq. (33-3) analogous to (33-9), and deduce
from it the result (33-4). The integration will be simplified if $\sin\omega t$ is replaced by
$(e^{i\omega t} - e^{-i\omega t})/2i$.
34. The Euler Column. Rotating Shaft. It is known from experiments
that a long rectilinear rod subjected to the action of axial compressive
forces is compressed and retains its initial shape as long as the compressive
forces do not exceed a certain critical value. Upon gradual increase of the
compressive load $P$, a value of $P = P_1$ is reached when the rod buckles
suddenly and becomes curved. The deflections of rods so compressed
become extremely sensitive to minute changes of the load and increase
rapidly with the increase in $P$. A detailed analysis of this instability or
buckling phenomenon depends on rather delicate considerations in nonlinear
theory of elasticity. However, if the argument of Euler is followed,
it is possible to deduce the magnitude of the critical load $P_1$ from linear
differential equations governing small deflections of loaded rods.
Thus, consider a rod of uniform cross section and length $l$, compressed
by the forces $P$ applied to its ends (Fig. 23). Initially this rod is straight,
but after the critical load $P_1$ is reached, it becomes curved, and we denote
the deflections of its central line by $y$.
[Fig. 23]
It is known from the Bernoulli-Euler law (5-2) that for small deflections
$$\frac{d^2y}{dx^2} = \frac{M}{EI},$$
where, in our case, the bending moment $M = -Py$. Thus
$$\frac{d^2y}{dx^2} = -\frac{Py}{EI}$$
or
$$y'' + k^2y = 0, \tag{34-1}$$
where
$$k^2 = \frac{P}{EI}. \tag{34-2}$$
Equation (34-1) must be solved subject to the end conditions
$$y(0) = 0, \qquad y(l) = 0, \tag{34-3}$$
since the ends of the rod remain on the $x$ axis.
The boundary-value problem characterized by Eqs. (34-1) and (34-3) is
quite different from the initial-value problems considered heretofore. In
the initial-value problems we seek solutions of differential equations
satisfying specified conditions at one point only, while in the boundary-value
problem stated above the solution $y$ must satisfy conditions (34-3)
assigned at two points $x = 0$ and $x = l$. It is not obvious that a solution
of a differential equation satisfying specified conditions at two points
exists in general. We shall see, however, that for suitable choices of the
parameter $k$ Eq. (34-1) does have solutions vanishing at the end points
$x = 0$, $x = l$.
Now the general solution of (34-1) is
$$y = c_1\cos kx + c_2\sin kx \tag{34-4}$$
and, on imposing the conditions (34-3), we get two equations
$$0 = c_1\cos k0 + c_2\sin k0,$$
$$0 = c_1\cos kl + c_2\sin kl.$$
These demand that
$$c_1 = 0, \qquad c_2\sin kl = 0. \tag{34-5}$$
The choice $c_1 = c_2 = 0$ gives $y = 0$, corresponding to the rectilinear shape
of the rod. If the rod does not remain straight, $c_2 \neq 0$, and we conclude
from (34-5) that $\sin kl = 0$, so that
$$k = \frac{n\pi}{l}, \qquad n = 0, 1, 2, \ldots. \tag{34-6}$$
The choice of $n = 0$ again gives $y = 0$. If $n = 1$, $k = \pi/l$, and on recalling
the definition (34-2), we see that the corresponding value of $P$ is
$$P_1 = \frac{\pi^2EI}{l^2}. \tag{34-7}$$
This is the critical, or the Euler, load.
The shape of the central line of the rod, in this case, is
$$y = c_2\sin\frac{\pi x}{l}.$$
The choice of $n = 2, 3, \ldots$ in (34-6) gives other "critical loads" $P_2$, $P_3$,
... and the corresponding solutions
$$y = c_2\sin\frac{n\pi x}{l}.$$
The maximum deflection $c_2$ is not determined in this analysis, and, indeed,
no far-reaching conclusions should be made from such calculations inasmuch
as they are based on the assumption of small deflections implicit
in our use of the Bernoulli-Euler law.
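The critical loads follow from (34-2) and (34-6) as $P_n = n^2\pi^2EI/l^2$. A short
Python sketch of this computation is given below; the function names and the sample
rod (values of $E$, $I$, and $l$) are hypothetical and serve only as an illustration.

```python
import math

def euler_load(E, I, l, n=1):
    """n-th critical load P_n = n^2 pi^2 E I / l^2, from (34-2) and (34-6)."""
    return (n * math.pi) ** 2 * E * I / l ** 2

def mode_shape(x, l, n=1, c2=1.0):
    """Deflection c2 sin(n pi x / l); c2 is not determined by the linear theory."""
    return c2 * math.sin(n * math.pi * x / l)

# Hypothetical rod: E in Pa, I in m^4, l in m.
E, I, l = 200e9, 1.0e-8, 2.0
P1 = euler_load(E, I, l)
print(P1, euler_load(E, I, l, n=2) / P1)        # second load relative to the Euler load
print(mode_shape(0.0, l), mode_shape(l, l))      # both end conditions (34-3) are met
```

The ratio of successive loads depends only on $n$, so the second critical load is
four times the Euler load regardless of the material.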
Another interesting problem, essentially of the same sort, arises in the
study of rotating shafts. It has been noted that when a long shaft supported
by bearings at $x = 0$ and $x = l$ is allowed to rotate, its initially
rectilinear shape is preserved only if the speed of rotation $\omega$ does not
exceed a certain critical value $\omega_1$. On approaching the speed $\omega_1$ the shaft
starts pulsating and its shape changes. On further increase of the speed
another critical value $\omega_2$ is reached when the shaft starts beating and its
shape changes again, and so on. This phenomenon can, in part, be explained
by calculations similar to those used in determining the Euler load.
Let us suppose that the shaft is rotating with the angular speed $\omega$. An
element of length $dx$ of the shaft experiences the centrifugal force
$$F\,dx = \rho\,dx\,\omega^2y,$$
where $\rho$ is the density per unit length of the shaft and $y$ is the deflection
at the point $x$. Thus,
$$F = \rho\omega^2y \tag{34-8}$$
is the force per unit length of the shaft distributed along its length. It
is shown in books on strength of materials that when the forces $F$ acting on
a rod are normal to its axis, then
$$F = \frac{d^2M}{dx^2},$$
where the bending moment $M$ is given by the Bernoulli-Euler law
$$M = EI\frac{d^2y}{dx^2}. \tag{34-9}$$
Thus,
$$\frac{d^2}{dx^2}\left(EI\frac{d^2y}{dx^2}\right) = F, \tag{34-10}$$
and if the flexural rigidity $EI$ is constant, Eq. (34-10) reads
$$\frac{d^4y}{dx^4} = \frac{F}{EI}. \tag{34-11}$$
The substitution for $F$ from (34-8) gives the desired equation for the
rotating shaft:
$$\frac{d^4y}{dx^4} - k^4y = 0 \tag{34-12}$$
with
$$k^4 = \frac{\rho\omega^2}{EI}. \tag{34-13}$$
Since the roots of the characteristic equation $m^4 - k^4 = 0$ are $m = \pm k$,
$m = \pm ki$, the general solution of (34-12) is
$$y = c_1e^{kx} + c_2e^{-kx} + c_3\cos kx + c_4\sin kx. \tag{34-14}$$
If at the points of support $x = 0$, $x = l$ the deflection $y$ and the moment
$M$ are zero, then [see (34-9)]
$$y(0) = 0, \quad y''(0) = 0, \quad y(l) = 0, \quad y''(l) = 0. \tag{34-15}$$
The substitution from (34-14) into the boundary conditions (34-15)
yields four equations:
$$c_1 + c_2 + c_3 = 0,$$
$$c_1 + c_2 - c_3 = 0,$$
$$c_1e^{kl} + c_2e^{-kl} + c_3\cos kl + c_4\sin kl = 0, \tag{34-16}$$
$$c_1e^{kl} + c_2e^{-kl} - c_3\cos kl - c_4\sin kl = 0.$$
The solution $c_1 = c_2 = c_3 = c_4 = 0$, yielding $y = 0$, corresponds to the
straight shaft. The system (34-16) also has nonzero solutions for certain
values of $k$. From the first two equations (34-16) we find
$$c_1 = -c_2, \qquad c_3 = 0,$$
and the substitution of these values in the two remaining equations gives
$$c_1 = c_2 = c_3 = 0, \qquad c_4\sin kl = 0.$$
Thus, $\sin kl = 0$ unless $c_4 = 0$, and hence
$$k = \frac{n\pi}{l}, \qquad n = 1, 2, \ldots.$$
Using the value of $k$ for $n = 1$ in (34-13) gives the first critical speed
$$\omega_1 = \frac{\pi^2}{l^2}\sqrt{\frac{EI}{\rho}}.$$
The critical speeds $\omega_2$, $\omega_3$, ... are determined by taking $k$ with $n = 2, 3, \ldots$.
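The critical speeds and the boundary equations (34-16) can be checked with a few
lines of Python. In this sketch the function names and the numerical values of $E$,
$I$, $\rho$, and $l$ are assumptions made purely for illustration; the check uses the
nonzero solution $c_1 = c_2 = c_3 = 0$, $c_4 = 1$ at $k = \pi/l$.

```python
import math

def critical_speed(E, I, rho, l, n=1):
    """omega_n = (n pi / l)^2 sqrt(E I / rho), from k = n pi / l and (34-13)."""
    return (n * math.pi / l) ** 2 * math.sqrt(E * I / rho)

def residuals(k, l, c=(0.0, 0.0, 0.0, 1.0)):
    """Left-hand sides of the four equations (34-16) for the coefficients c1..c4."""
    c1, c2, c3, c4 = c
    ekl, emkl = math.exp(k * l), math.exp(-k * l)
    return (
        c1 + c2 + c3,
        c1 + c2 - c3,
        c1 * ekl + c2 * emkl + c3 * math.cos(k * l) + c4 * math.sin(k * l),
        c1 * ekl + c2 * emkl - c3 * math.cos(k * l) - c4 * math.sin(k * l),
    )

E, I, rho, l = 200e9, 1.0e-8, 5.0, 2.0      # hypothetical values in SI units
w1 = critical_speed(E, I, rho, l)
print(w1, critical_speed(E, I, rho, l, n=2) / w1)
print(residuals(math.pi / l, l))            # all four should vanish up to rounding
```

As with the Euler loads, successive critical speeds scale as $n^2$, so the second
critical speed is four times the first.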
PROBLEMS
1. When a beam lies on an elastic foundation, then in addition to the transverse
external load $F(x)$, there is a restoring force $R = -a^2y$ proportional to the deflection $y$.
The equation of the axis of the beam then has the form
$$EIy^{\mathrm{IV}} + a^2y = F(x).$$
Solve this equation for $F(x) = p$, a constant, by assuming that the ends of the beam are
hinged so that
$$y(0) = y''(0) = y(l) = y''(l) = 0.$$
2. The differential equation of the deflection $y$ of the truss of a suspension bridge has
the form
$$EI\frac{d^4y}{dx^4} - (H + h)\frac{d^2y}{dx^2} = p - q\frac{h}{H},$$
where $H$ = horizontal tension in cable under dead load $q$
$h$ = tension due to live load $p$
$E$ = Young's modulus
$I$ = moment of inertia of cross section of truss about horizontal axis of truss
through center of gravity of section and perpendicular to direction of length
of truss
Solve this equation under the assumption that $p - qh/H$ is a constant.
3. The differential equation of the buckling of an elastically supported beam under an
axial load $P$ has the form
$$\frac{d^4y}{dx^4} + \frac{P}{EI}\frac{d^2y}{dx^2} + \frac{k}{EI}y = 0,$$
where $EI$ is the flexural rigidity and $k$ is the modulus of the foundation. Solve this
equation.
SYSTEMS OF EQUATIONS
35. Reduction of Systems to a Single Equation. We saw in Sec. 13 that
it may prove advantageous to reduce the solution of a second-order equation
to the solution of a system of two equations of first order. Thus the
dynamical equation considered in Sec. 31,
$$\frac{d^2s}{dt^2} = F(s,s',t)$$
with $s' \equiv ds/dt$, can be reduced to a system of two equations,
$$\frac{ds}{dt} = v, \qquad \frac{dv}{dt} = F(s,v,t),$$
by setting $s' = v$.
In the same manner, the third-order equation
$$\frac{d^3y}{dt^3} = F(y,y',y'',t), \tag{35-1}$$
in which $y' \equiv dy/dt$ and $y'' \equiv d^2y/dt^2$, is reducible to a system of three
first-order equations in $x_1$, $x_2$, $x_3$ defined by
$$y = x_1, \qquad y' = x_2, \qquad y'' = x_3.$$
With these definitions, Eq. (35-1) can be replaced by a system of three
equations:
$$\frac{dx_1}{dt} = x_2, \qquad \frac{dx_2}{dt} = x_3, \qquad \frac{dx_3}{dt} = F(x_1,x_2,x_3,t). \tag{35-2}$$
This procedure can be extended to nth-order equations.
A reduction of the nth-order equation to a system of n first-order equations
is of some practical importance in numerical integration of equations
on differential analyzers and electronic calculators. Such computing devices
are usually so designed that it is simpler to calculate n first derivatives
than one derivative of order n. The reduction also has numerous advantages
in theoretical considerations.
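The reduction (35-2) is exactly the form that a numerical integrator expects. As a
minimal sketch, the Python code below integrates the sample third-order equation
$y''' = -y'$ (chosen here only for illustration, not taken from the text) as a system of
three first-order equations, using the classical fourth-order Runge-Kutta method. With
$y(0) = 0$, $y'(0) = 1$, $y''(0) = 0$ the exact solution is $y = \sin t$.

```python
import math

def rhs(t, x):
    """System (35-2) for the sample equation y''' = -y', that is, F = -x2."""
    x1, x2, x3 = x
    return (x2, x3, -x2)

def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step for a first-order system dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k1)))
    k3 = f(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k2)))
    k4 = f(t + h, tuple(xi + h * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + h / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )

# Initial data y(0) = 0, y'(0) = 1, y''(0) = 0, for which y = sin t.
t, x, h = 0.0, (0.0, 1.0, 0.0), 0.01
for _ in range(100):                 # integrate from t = 0 to t = 1
    x = rk4_step(rhs, t, x, h)
    t += h

print(x[0], math.sin(1.0))           # x1 approximates y(1)
print(x[1], math.cos(1.0))           # x2 approximates y'(1)
```

Only first derivatives are ever evaluated, which illustrates why the first-order form
is convenient for computing devices.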
Systems of differential equations appear naturally in problems involving
dynamical systems with several degrees of freedom. Thus, the motion
of a particle constrained to move on a surface can be described by two
positional coordinates¹ $(x,y)$. These coordinates satisfy equations of the
form
$$\frac{d^2x}{dt^2} = F_1\left(x, y, \frac{dx}{dt}, \frac{dy}{dt}, t\right),$$
$$\frac{d^2y}{dt^2} = F_2\left(x, y, \frac{dx}{dt}, \frac{dy}{dt}, t\right).$$
This pair of second-order equations can be reduced to a system of four
first-order equations.
Alternatively, a system of n first-order equations can usually be reduced
to a single nth-order equation. A general discussion of this problem
is involved, and we confine our remarks to systems of linear equations,
because such systems commonly occur in applications.
A system of n first-order equations
$$\frac{dy_1}{dt} = a_{11}y_1 + a_{12}y_2 + \cdots + a_{1n}y_n + f_1(t),$$
$$\frac{dy_2}{dt} = a_{21}y_1 + a_{22}y_2 + \cdots + a_{2n}y_n + f_2(t), \tag{35-3}$$
$$\cdots$$
$$\frac{dy_n}{dt} = a_{n1}y_1 + a_{n2}y_2 + \cdots + a_{nn}y_n + f_n(t),$$
in which the $a_{ij}$ and the $f_i(t)$ are continuous functions of $t$, is called linear.
If the $f_i(t)$ are all zero, the system is called homogeneous.
The system (35-3) is linear because the solutions of the associated homogeneous system
satisfy the linearity properties stated in Sec. 21. Thus if
$$y_1^{(1)}(t), \ldots, y_n^{(1)}(t)$$
and
$$y_1^{(2)}(t), \ldots, y_n^{(2)}(t)$$
are any two solutions of the homogeneous system, then the set of functions
$$c_1y_1^{(1)} + c_2y_1^{(2)}, \quad c_1y_2^{(1)} + c_2y_2^{(2)}, \quad \ldots, \quad c_1y_n^{(1)} + c_2y_n^{(2)}$$
is a solution of the homogeneous system for any choice of the constants $c_1$, $c_2$.
¹ If a particle moves on a sphere, for example, $x$ and $y$ may be taken as the latitude
and longitude, respectively.
Furthermore, it can be shown that the homogeneous system associated with (35-3) has
a set of n solutions
$$y_1^{(1)}, y_2^{(1)}, \ldots, y_n^{(1)} \quad \text{(first solution)}$$
$$y_1^{(2)}, y_2^{(2)}, \ldots, y_n^{(2)} \quad \text{(second solution)}$$
$$\cdots$$
$$y_1^{(n)}, y_2^{(n)}, \ldots, y_n^{(n)} \quad \text{(nth solution)}$$
such that the determinant
$$\begin{vmatrix}
y_1^{(1)} & y_2^{(1)} & \cdots & y_n^{(1)} \\
y_1^{(2)} & y_2^{(2)} & \cdots & y_n^{(2)} \\
\vdots & \vdots & & \vdots \\
y_1^{(n)} & y_2^{(n)} & \cdots & y_n^{(n)}
\end{vmatrix} \neq 0.$$
The general solution of the system (35-3) is then given by the set of n functions
$$y_i = c_1y_i^{(1)} + c_2y_i^{(2)} + \cdots + c_ny_i^{(n)} + u_i(t), \qquad i = 1, 2, \ldots, n, \tag{35-4}$$
where $y_1 = u_1(t)$, $y_2 = u_2(t)$, ..., $y_n = u_n(t)$ is any solution of the nonhomogeneous
system and the $c_i$ are arbitrary constants. The solution (35-4) is general in the sense
that the $c$'s can always be chosen so that there is a unique solution of the system (35-3)
satisfying the arbitrarily prescribed initial conditions:
$$y_1(t_0) = y_{10}, \quad y_2(t_0) = y_{20}, \quad \ldots, \quad y_n(t_0) = y_{n0}.$$
We indicate next how a system of first-order linear equations with constant
coefficients can ordinarily be reduced to an equivalent single linear
equation with constant coefficients whose order is equal to the number of
equations in the system.
Consider the system of two equations
$$\frac{dx}{dt} + a_1x + a_2y = f_1(t),$$
$$\frac{dy}{dt} + b_1x + b_2y = f_2(t). \tag{35-5}$$
We introduce the operator $D \equiv d/dt$ and write (35-5) as
$$(D + a_1)x + a_2y = f_1(t),$$
$$b_1x + (D + b_2)y = f_2(t). \tag{35-6}$$
Operating on the second equation in (35-6) with $(1/b_1)(D + a_1)$, we get
$$(D + a_1)x + \frac{1}{b_1}(D + a_1)(D + b_2)y = \frac{1}{b_1}(D + a_1)f_2(t). \tag{35-7}$$
If we subtract the first equation in (35-6) from (35-7), we get, on multiplying
through by $b_1$,
$$(D + a_1)(D + b_2)y - b_1a_2y = (D + a_1)f_2(t) - b_1f_1(t). \tag{35-8}$$
This is a second-order linear differential equation with constant coefficients
whose right-hand member is a known function. Hence its general solution
$y = y(t)$ can readily be obtained.
The characteristic equation for (35-8) is
$$(m + a_1)(m + b_2) - b_1a_2 = 0, \tag{35-9}$$
and if its roots $m = m_1$, $m = m_2$ are distinct, the general solution of (35-8)
is
$$y = c_1e^{m_1t} + c_2e^{m_2t} + u(t),$$
where $u(t)$ is a particular integral of (35-8). If (35-9) has a double root
$m_1 = m_2$, the corresponding solution is
$$y = c_1e^{m_1t} + c_2te^{m_1t} + u(t).$$
Having obtained $y$, we can compute the solution for $x$, without further
integration, by substituting $y(t)$ in the second equation in (35-5). Thus
$$x(t) = \frac{1}{b_1}\left[f_2(t) - b_2y(t) - \frac{dy}{dt}\right].$$
The procedure for reduction of larger systems or for systems of equations
of order higher than 1 is similar.¹
Example 1. Consider
$$\frac{dx}{dt} + 2x - 2y = t,$$
$$\frac{dy}{dt} - 3x + y = e^t,$$
or
$$(D + 2)x - 2y = t,$$
$$-3x + (D + 1)y = e^t.$$
Operate on the second of these equations with $\frac{1}{3}(D + 2)$ to obtain
$$-(D + 2)x + \tfrac{1}{3}(D + 2)(D + 1)y = \tfrac{1}{3}(D + 2)e^t,$$
and add this result to the first equation. The result is
$$\tfrac{1}{3}(D + 2)(D + 1)y - 2y = \tfrac{1}{3}(D + 2)e^t + t,$$
which simplifies to
$$(D^2 + 3D - 4)y = 3e^t + 3t.$$
This equation can be solved for $y$ as a function of $t$, and the result can be substituted in
the second of the given equations to obtain $x$.
1 See Example 2, p. 99.
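The reduction in Example 1 can be verified numerically. The sketch below uses a
general solution of the reduced equation worked out here as an assumption (roots
$m = 1, -4$ and particular integral $\frac{3}{5}te^t - \frac{3}{4}t - \frac{9}{16}$; it is not given in
the text), recovers $x$ from the second given equation, and checks that the first given
equation is then satisfied. The constants `C1` and `C2` are arbitrary.

```python
import math

C1, C2 = 1.0, -2.0       # arbitrary constants, chosen only for the check

def y(t):
    """Assumed general solution of (D^2 + 3D - 4) y = 3 e^t + 3 t."""
    return (C1 * math.exp(t) + C2 * math.exp(-4 * t)
            + 0.6 * t * math.exp(t) - 0.75 * t - 9.0 / 16.0)

def dy(t):
    """Exact derivative of y."""
    return (C1 * math.exp(t) - 4 * C2 * math.exp(-4 * t)
            + 0.6 * (1 + t) * math.exp(t) - 0.75)

def x(t):
    """x from the second given equation: 3x = dy/dt + y - e^t."""
    return (dy(t) + y(t) - math.exp(t)) / 3.0

def first_equation_residual(t, h=1e-5):
    """dx/dt + 2x - 2y - t, with dx/dt approximated by a central difference."""
    dx = (x(t + h) - x(t - h)) / (2 * h)
    return dx + 2 * x(t) - 2 * y(t) - t

residual_values = [first_equation_residual(t) for t in (0.0, 0.5, 1.0)]
print(residual_values)   # each entry should be zero up to discretization error
```

No further integration is needed to obtain $x$, in agreement with the remark following
(35-9).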
Example 2. Let the two masses $M_1$ and $M_2$ be suspended from two springs, as indicated
in Fig. 24, and assume that the coefficients of stiffness of the springs are $k_1$ and
$k_2$, respectively. Denote the displacements of the masses from their positions of
equilibrium by $x$ and $y$. Then it can be established that the following equations must hold:
$$M_2\frac{d^2y}{dt^2} = -k_2(y - x),$$
$$M_1\frac{d^2x}{dt^2} = k_2(y - x) - k_1x.$$
These equations can be simplified to read
$$\frac{d^2y}{dt^2} + \frac{k_2}{M_2}y - \frac{k_2}{M_2}x = 0,$$
$$\frac{d^2x}{dt^2} - \frac{k_2}{M_1}y + \frac{k_1 + k_2}{M_1}x = 0.$$
By setting
$$\frac{k_1}{M_1} = a^2, \qquad \frac{k_2}{M_2} = b^2, \qquad \frac{M_2}{M_1} = m,$$
the equations reduce to
$$(D^2 + b^2)y - b^2x = 0,$$
$$-b^2my + (D^2 + a^2 + b^2m)x = 0.$$
Operating on the second of these reduced equations with $(1/b^2m)(D^2 + b^2)$ and adding
the result to the first of the equations give
$$(D^2 + b^2)(D^2 + a^2 + b^2m)x - b^4mx = 0$$
or
$$[D^4 + (a^2 + b^2 + b^2m)D^2 + a^2b^2]x = 0.$$
This is a fourth-order differential equation which can be solved for $x$ as a function of $t$.
It is readily checked that
$$x = A\sin(\omega t - \epsilon)$$
is a solution, provided that $\omega$ is suitably chosen. There will be two positive values of
$\omega$ which will satisfy the conditions. The motion of the spring is a combination of two
simple harmonic motions of different frequencies.
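Substituting $x = A\sin(\omega t - \epsilon)$ into the fourth-order equation replaces $D^2$ by
$-\omega^2$, so that $\omega$ must satisfy $\omega^4 - (a^2 + b^2 + b^2m)\omega^2 + a^2b^2 = 0$. The
following Python sketch solves this quadratic in $\omega^2$; the function name and the
sample masses and stiffnesses are hypothetical.

```python
import math

def natural_frequencies(a2, b2, m):
    """Positive roots w of w^4 - (a^2 + b^2 + b^2 m) w^2 + a^2 b^2 = 0.

    a2 = k1/M1, b2 = k2/M2, m = M2/M1, as in Example 2.
    """
    s = a2 + b2 + b2 * m
    disc = math.sqrt(s * s - 4.0 * a2 * b2)   # real, since s >= a2 + b2 >= 2 sqrt(a2 b2)
    return math.sqrt((s - disc) / 2.0), math.sqrt((s + disc) / 2.0)

# Hypothetical data: k1 = 8, k2 = 2, M1 = 2, M2 = 1 in consistent units.
a2, b2, m = 8.0 / 2.0, 2.0 / 1.0, 1.0 / 2.0
w_low, w_high = natural_frequencies(a2, b2, m)
print(w_low, w_high)
```

Because the discriminant is positive whenever $m > 0$, the two frequencies are always
distinct, which is why the motion is a combination of two simple harmonic motions.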
PROBLEMS
Solve the systems:
dy
dx
dt
dx
dt
dt
3x — 2y,
dy dx
dt ^ dt
2 y,
dy
dt
dy
“2 x -y;
2z'
2 .
dx
a s
(fix
a*
dy
v ’ Tt
d 2 y
dt 2
x;
V ,
x;
dx
dt dt
6. (D 4* 1)* 4 (2D 4 1 )y - (D - \)x 4 (D 4 1 )v - 1.
7. Determine the solution in Example 1 satisfying $y(0) = 0$, $x(0) = 0$.
8. The equations of motion of a particle of mass $m$ are
$$m\frac{d^2x}{dt^2} = X, \qquad m\frac{d^2y}{dt^2} = Y, \qquad m\frac{d^2z}{dt^2} = Z,$$
where $x$, $y$, $z$ are the coordinates of the particle and $X$, $Y$, $Z$ are the components of force
in the directions of the $x$, $y$, and $z$ axes, respectively. If the particle moves in the $xy$
plane under a central attractive force, proportional to the distance of the particle from
the origin, find the differential equations of motion of the particle.
9. Find the equation of the path of a particle whose coordinates x and y satisfy the
differential equations
d x , U lly
m -~r + He -~
dt 2 dt
Ee %
0 ,
where $H$, $E$, $e$, and $m$ are constants. Assume that $x = y = dx/dt = dy/dt = 0$ when
$t = 0$. This system of differential equations occurs in the determination of the ratio
of the charge to the mass of an electron.
10. The currents $I_1$ and $I_2$ in the two coupled circuits shown in Fig. 25 satisfy the
following differential equations:
$$M\frac{d^2I_1}{dt^2} + L_2\frac{d^2I_2}{dt^2} + R_2\frac{dI_2}{dt} + \frac{I_2}{C_2} = 0,$$
$$M\frac{d^2I_2}{dt^2} + L_1\frac{d^2I_1}{dt^2} + R_1\frac{dI_1}{dt} + \frac{I_1}{C_1} = 0.$$
[Fig. 25]
Reduce the solution of this system to that of a single fourth-order differential equation.
Solve the resulting equation under the assumption that the resistances $R_1$ and $R_2$ are
negligible.
36. Systems of Linear Equations with Constant Coefficients. We have
indicated in the preceding section how a system of linear equations with
constant coefficients can be solved by reducing the problem to the solution
of one equation of higher order. In this section we sketch another mode
of attack on the problem of solving the homogeneous system
$$\frac{dy_1}{dt} = a_{11}y_1 + a_{12}y_2 + \cdots + a_{1n}y_n,$$
$$\frac{dy_2}{dt} = a_{21}y_1 + a_{22}y_2 + \cdots + a_{2n}y_n, \tag{36-1}$$
$$\cdots$$
$$\frac{dy_n}{dt} = a_{n1}y_1 + a_{n2}y_2 + \cdots + a_{nn}y_n,$$
with constant coefficients. A third method, based on the Laplace transform,
is given in Appendix B.
Let us seek our solution in the form
$$y_1(t) = k_1e^{\lambda t}, \quad y_2(t) = k_2e^{\lambda t}, \quad \ldots, \quad y_n(t) = k_ne^{\lambda t}, \tag{36-2}$$
where the constants $k_i$ and $\lambda$ are to be determined so that Eqs. (36-1) are
satisfied identically.
The substitution from (36-2) in (36-1) yields
$$\lambda k_1e^{\lambda t} = (a_{11}k_1 + a_{12}k_2 + \cdots + a_{1n}k_n)e^{\lambda t},$$
$$\lambda k_2e^{\lambda t} = (a_{21}k_1 + a_{22}k_2 + \cdots + a_{2n}k_n)e^{\lambda t},$$
$$\cdots$$
$$\lambda k_ne^{\lambda t} = (a_{n1}k_1 + a_{n2}k_2 + \cdots + a_{nn}k_n)e^{\lambda t}.$$
On dividing each equation by $e^{\lambda t}$ and transposing all terms to one side, we
get the system
$$(a_{11} - \lambda)k_1 + a_{12}k_2 + \cdots + a_{1n}k_n = 0,$$
$$a_{21}k_1 + (a_{22} - \lambda)k_2 + \cdots + a_{2n}k_n = 0, \tag{36-3}$$
$$\cdots$$
$$a_{n1}k_1 + a_{n2}k_2 + \cdots + (a_{nn} - \lambda)k_n = 0.$$
This system is a system of linear homogeneous algebraic equations for
the unknown $k$'s. It has an obvious solution
$$k_1 = k_2 = \cdots = k_n = 0$$
corresponding to the trivial solution
$$y_1 = y_2 = \cdots = y_n = 0.$$
Since we are interested in solutions (36-2) which are not all zero, we must
seek values of the $k_i$ which are not all zero. Now, the system of Eqs. (36-3)
will have such solutions for the $k_i$ if, and only if, its determinant¹
$$D \equiv \begin{vmatrix}
a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda
\end{vmatrix} = 0. \tag{36-4}$$
The equation $D = 0$ is called the characteristic equation for the system
(36-1). On expanding the determinant, we see that (36-4) is an algebraic
equation of degree n in $\lambda$, and thus it has n real or complex roots:
$$\lambda = \lambda_1, \quad \lambda = \lambda_2, \quad \ldots, \quad \lambda = \lambda_n.$$
¹ See Appendix A.
If all these roots are distinct, then corresponding to each root $\lambda = \lambda_i$
there will be a solution of the form (36-2), namely,
$$y_1(t) = k_1e^{\lambda_it}, \quad y_2(t) = k_2e^{\lambda_it}, \quad \ldots, \quad y_n(t) = k_ne^{\lambda_it}. \tag{36-5}$$
The constants $k_i$ in (36-5) must satisfy Eqs. (36-3) with $\lambda$ replaced by $\lambda_i$.
When Eq. (36-4) has multiple roots, the forms of solutions corresponding
to multiple roots are more complicated. One solution corresponding to a
multiple root $\lambda = \lambda_i$ surely has the form (36-5), but there will also be
solutions in the form of polynomials in $t$ multiplied¹ by $e^{\lambda_it}$.
To clarify this discussion we consider a simple example.
Example: Solve the system
$$\frac{dy_1}{dt} = 2y_1 + 3y_2,$$
$$\frac{dy_2}{dt} = 2y_1 + y_2. \tag{36-6}$$
We take a solution in the form
$$y_1 = k_1e^{\lambda t}, \qquad y_2 = k_2e^{\lambda t}. \tag{36-7}$$
The characteristic equation (36-4) now reads
$$\begin{vmatrix} 2 - \lambda & 3 \\ 2 & 1 - \lambda \end{vmatrix} = 0.$$
On expanding it we get
$$\lambda^2 - 3\lambda - 4 = 0,$$
the roots of which are $\lambda_1 = -1$, $\lambda_2 = 4$. Thus, corresponding to the root $\lambda_1 = -1$, we
have a solution
$$y_1 = k_1e^{-t}, \qquad y_2 = k_2e^{-t}. \tag{36-8}$$
To determine $k_1$ and $k_2$ we form the system (36-3),
$$(2 - \lambda)k_1 + 3k_2 = 0,$$
$$2k_1 + (1 - \lambda)k_2 = 0, \tag{36-9}$$
set $\lambda = -1$, and solve it for the $k$'s. The result is
$$k_1 = -k_2.$$
Thus, one of the $k$'s can be chosen at will. If we take $k_1 = a$, we see from (36-8) that one
solution of (36-6) is
$$y_1 = ae^{-t}, \qquad y_2 = -ae^{-t}, \tag{36-10}$$
the constant $a$ being arbitrary.
Another solution is obtained by taking $\lambda = 4$. It has the form
$$y_1 = k_1e^{4t}, \qquad y_2 = k_2e^{4t} \tag{36-11}$$
¹ Recall the corresponding situation in Sec. 26.
with $k_i$ determined by Eqs. (36-9) with $\lambda = 4$. We find this time
$$k_2 = \tfrac{2}{3}k_1,$$
so that, again, one of the $k$'s can be chosen at will. If we take $k_1 = b$, we obtain a solution
$$y_1 = be^{4t}, \qquad y_2 = \tfrac{2}{3}be^{4t}. \tag{36-12}$$
From (35-4) it follows that the general solution of the system (36-6) is obtained by forming
a linear combination of solutions (36-12) and (36-10). We thus get the general solution
$$y_1 = ae^{-t} + be^{4t}, \qquad y_2 = -ae^{-t} + \tfrac{2}{3}be^{4t}.$$
This solution could have been obtained more easily by the method of Sec. 35. Thus,
on writing the given system in the form
$$(D - 2)y_1 - 3y_2 = 0,$$
$$-2y_1 + (D - 1)y_2 = 0, \tag{36-13}$$
we operate on the first equation with $\frac{1}{3}(D - 1)$, add the result to the second, and get
$$\tfrac{1}{3}(D - 1)(D - 2)y_1 - 2y_1 = 0. \tag{36-14}$$
The corresponding characteristic equation is
$$\tfrac{1}{3}(m - 1)(m - 2) - 2 = 0,$$
or
$$m^2 - 3m - 4 = 0.$$
Since its roots are $m_1 = -1$, $m_2 = 4$, the general solution of Eq. (36-14) is
$$y_1 = c_1e^{-t} + c_2e^{4t}.$$
From the first of Eqs. (36-13) we have
$$y_2 = \tfrac{1}{3}(D - 2)y_1 = \tfrac{1}{3}(D - 2)(c_1e^{-t} + c_2e^{4t}) = -c_1e^{-t} + \tfrac{2}{3}c_2e^{4t}.$$
This checks the result found previously.
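The roots of the characteristic equation and the ratios of the $k$'s in this example can
be reproduced with a short computation. The Python sketch below is limited to the
2-by-2 case with real roots; the function names are assumptions introduced for
illustration.

```python
import math

A = ((2.0, 3.0), (2.0, 1.0))      # coefficients of the system (36-6)

def characteristic_roots(A):
    """Roots of lambda^2 - (trace) lambda + det = 0 for a 2x2 matrix (real roots assumed)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr - disc) / 2.0, (tr + disc) / 2.0

def k_ratio(A, lam):
    """Ratio k2/k1 from the first equation of (36-3): (a11 - lam) k1 + a12 k2 = 0."""
    return -(A[0][0] - lam) / A[0][1]

roots = characteristic_roots(A)
ratios = [k_ratio(A, lam) for lam in roots]
print(roots, ratios)
```

The ratios correspond to $k_1 = -k_2$ for the first root and $k_2 = \frac{2}{3}k_1$ for the
second, as found above. For larger systems a general eigenvalue routine would be
used instead of the quadratic formula.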
The main object of this section is not so much to provide a new method
for solving systems of linear equations but to introduce a few ideas on
which the important study of stability of solutions of differential equations
is based. There are several notions of stability of solutions, and we illustrate
only two such by considering some simple examples.
The system
$$\frac{dy}{dt} = x,$$
$$\frac{dx}{dt} = -2bx - a^2y, \tag{36-15}$$
is, obviously, equivalent to one second-order equation
$$\frac{d^2y}{dt^2} + 2b\frac{dy}{dt} + a^2y = 0. \tag{36-16}$$
As we saw in Sec. 32, its general solution when $b^2 - a^2 > 0$ is
$$y = c_1e^{(-b + \sqrt{b^2 - a^2})t} + c_2e^{(-b - \sqrt{b^2 - a^2})t}. \tag{36-17}$$
If $b^2 - a^2 < 0$, we can write (36-17) as
$$y = e^{-bt}(A\cos\sqrt{a^2 - b^2}\,t + B\sin\sqrt{a^2 - b^2}\,t). \tag{36-18}$$
If $b = a$, we have the solution
$$y = e^{-bt}(c_1 + c_2t). \tag{36-19}$$
For $b = 0$, we have the equation
$$\frac{d^2y}{dt^2} + a^2y = 0, \tag{36-20}$$
whose general solution is
$$y = A\cos at + B\sin at. \tag{36-21}$$
We observe that if $b > 0$, the solutions (36-17) and (36-19) are damped.
That is, $|y(t)| \to 0$ as $t \to \infty$. If $b < 0$, these solutions are not damped
because $|y(t)| \to \infty$ for a sequence of values $t \to \infty$. As regards the case
$b = 0$, we see from (36-21) that $y(t)$ oscillates between $+\sqrt{A^2 + B^2}$ and
$-\sqrt{A^2 + B^2}$ [see the formula just above (32-1)].
If we write Eq. (36-16) in the form
$$\frac{d^2y}{dt^2} + a^2y = -2by', \qquad y' = \frac{dy}{dt}, \tag{36-22}$$
and compare it with (36-20), we are tempted to say that the solutions of
(36-22), for small values of $b$, can differ only slightly from solutions of
(36-20), because the right-hand members of these equations are nearly
equal if $b$ is sufficiently small. The fact that this is not so is obvious from
the foregoing remarks concerning the different behaviors of solutions of
(36-16) for positive and negative values of $b$.
Thus, in general, small changes (or perturbations) in the coefficients of
a differential equation may completely alter the nature of its solutions.
This remark has an important bearing on the problem of constructing
differential equations that purport to represent the behavior of physical
systems. In physical problems, the coefficients in a differential equation
are usually related to physical quantities. Such quantities are determined
from measurements which are subject to experimental errors. For this
reason, it is exceedingly important to know just what effect small variations
in the coefficients of a given equation have on the character of its solutions.
When small changes in the coefficients result in small changes in the solu-
tions, the solutions are termed stable.
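The sensitivity to the sign of a small coefficient $b$ can be seen numerically. The
sketch below evaluates the solution (36-18) of Eq. (36-16) for a small positive, zero,
and small negative $b$; the initial data $y(0) = 1$, $y'(0) = 0$ and the numerical values
are assumptions chosen for illustration.

```python
import math

def solution(t, a, b, y0=1.0, v0=0.0):
    """Solution (36-18) of y'' + 2 b y' + a^2 y = 0 with y(0) = y0, y'(0) = v0 (b^2 < a^2)."""
    w = math.sqrt(a * a - b * b)
    A = y0
    B = (v0 + b * y0) / w          # from y'(0) = -b A + w B
    return math.exp(-b * t) * (A * math.cos(w * t) + B * math.sin(w * t))

a, t = 1.0, 200.0
for b in (0.02, 0.0, -0.02):
    # The envelope e^{-bt} decays, stays fixed, or grows according to the sign of b.
    print(b, math.exp(-b * t), solution(t, a, b))

# Largest |y| over roughly one period starting at t, for the slightly negative b.
peak_negative_b = max(abs(solution(t + 0.1 * i, a, -0.02)) for i in range(70))
print(peak_negative_b)
```

Although the three values of $b$ differ by only a few hundredths, the solutions at
large $t$ differ by orders of magnitude.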
Another type of the stability problem occurs in the study of the depend-
ence of solutions on small changes in the initial values. In practice one
ordinarily seeks particular solutions that satisfy specified initial data.
The initial data are generally determined either experimentally or from a
specific assumption that certain physical conditions hold. (For example,
one may assume that the deflection of a beam at a given point is zero.)
If the initial conditions are altered slightly, is it true that the solution of
a given equation will not be affected by a great deal? The fact that solu-
tions of differential equations need not be continuous functions of initial
conditions is clear from the following examples.
Consider the solution of
$$\frac{dy}{dt} = -a^2y, \qquad a \neq 0, \tag{36-23}$$
subject to the initial condition $y(0) = y_0$. The desired solution obviously
is
$$y(t) = y_0e^{-a^2t}.$$
Now, if $y_0$ is changed by a small amount $\Delta y_0$, the corresponding solution is
$$\bar{y}(t) = (y_0 + \Delta y_0)e^{-a^2t}.$$
Because of the factor $e^{-a^2t}$,
$$|\bar{y}(t) - y(t)| \to 0 \quad \text{as } t \to \infty,$$
and hence for any $\epsilon > 0$ we can choose a $t_0$ such that
$$|\bar{y}(t) - y(t)| < \epsilon \quad \text{if } t > t_0.$$
Having chosen $t_0$, we let $\Delta y_0$ be so small that
$$|\bar{y}(t) - y(t)| < \epsilon \quad \text{if } 0 \le t \le t_0.$$
Then it follows that $|\bar{y} - y| < \epsilon$ on the whole interval $0 \le t < \infty$, and
hence the solutions are stable. By (36-5) similar arguments apply to systems
of equations with constant coefficients, and it is found that the system
(36-1) has stable solutions when all roots of the characteristic equation
(36-4) have negative real parts.
On the other hand, if we solve
$$\frac{dy}{dt} = a^2y, \qquad a \neq 0, \tag{36-24}$$
subject to the same initial condition $y(0) = y_0$, we get
$$y(t) = y_0e^{a^2t}.$$
On replacing $y_0$ by $y_0 + \Delta y_0$ we get
$$\bar{y}(t) = (y_0 + \Delta y_0)e^{a^2t},$$
and this time $|\bar{y}(t) - y(t)| = e^{a^2t}|\Delta y_0|$. This becomes infinite as $t \to \infty$,
no matter how small $\Delta y_0$ is, so that the solutions of (36-24) are unstable.
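The contrast between (36-23) and (36-24) can be tabulated directly from the two
formulas for $|\bar{y}(t) - y(t)|$. In the Python sketch below the function name and the
sample values of $a$ and $\Delta y_0$ are assumptions made for illustration.

```python
import math

def deviation(t, a, dy0, stable=True):
    """|ybar(t) - y(t)| for dy/dt = -a^2 y (stable) or dy/dt = +a^2 y (unstable)."""
    sign = -1.0 if stable else 1.0
    return abs(dy0) * math.exp(sign * a * a * t)

a, dy0 = 1.0, 1e-6      # illustrative values only
for t in (0.0, 10.0, 20.0):
    print(t, deviation(t, a, dy0, stable=True), deviation(t, a, dy0, stable=False))
```

In the stable case the deviation never exceeds its initial value, while in the unstable
case even a perturbation of one part in a million eventually dominates the solution.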
PROBLEMS
1. Use the method of this section to obtain the general solution of the system
$$\frac{dy_1}{dt} = y_1 + y_2, \qquad \frac{dy_2}{dt} = 4y_1 + y_2.$$
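2. A system of linear second-order equations
$$\frac{d^2y_1}{dt^2} = a_{11}y_1 + a_{12}y_2 + \cdots + a_{1n}y_n,$$
$$\frac{d^2y_2}{dt^2} = a_{21}y_1 + a_{22}y_2 + \cdots + a_{2n}y_n,$$
$$\cdots$$
$$\frac{d^2y_n}{dt^2} = a_{n1}y_1 + a_{n2}y_2 + \cdots + a_{nn}y_n,$$
where the $a_{ij}$ are constants, is encountered frequently in dynamics. Show by assuming
solutions in the form
$$y_i = k_i\cos(\lambda t + \alpha), \qquad i = 1, 2, \ldots, n,$$
that one is led to the following characteristic equation for $\lambda$:
$$\begin{vmatrix}
a_{11} + \lambda^2 & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} + \lambda^2 & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} + \lambda^2
\end{vmatrix} = 0.$$
The constants $k_i$ are determined from the system of linear equations analogous to (36-3),
and the constant $\alpha$ remains arbitrary.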
3. Reduce the system of n second-order linear equations with constant coefficients,
$$\frac{d^2y_i}{dt^2} = \sum_{j=1}^{n} a_{ij}y_j + \sum_{j=1}^{n} b_{ij}\frac{dy_j}{dt}, \qquad i = 1, 2, \ldots, n,$$
to a system of 2n first-order equations.
CHAPTER 2
INFINITE SERIES
The General Theory
1. Convergence and Divergence 111
2. Some Basic Properties of Series 116
3. Improper Integrals and the Integral Test 118
4. Comparison Term by Term 122
5. Comparison of Ratios 125
6. Absolute Convergence 127
7. Uniform Convergence 132
Power Series and Taylor’s Formula
8. Properties of Power Series 138
9. Taylor's Formula 143
10. The Expression of Integrals as Infinite Series 147
11. Approximation by Means of Taylor’s Formula 149
Power Series and Differential Equations
12. First-order Equations 153
13. Second-order Equations. Legendre Functions 155
14. Generalized Power Series. Bessels Equation 159
Series with Complex Terms
15. Complex Numbers 166
16. Complex Series 169
17. Applications 171
Fourier Series
18. The Euler-Fourier Formulas 175
19. Even and Odd Functions 183
20. Extension of the Interval 187
21. Complex Form of Fourier Series 192
109
110 INFINITE SERIES ICHAP. 2
Additional Topics in Fourier Series
Orthogonal Functions 195
The Mean Convergence of Fourier Series 200
The Pointwise Convergence of Fourier Series 204
The Integration and Differentiation of Fourier Series 207
Although many functions encountered in applications are not elementary,
virtually every such function may be represented as an infinite series.
Nonelementary integrals like $\int \sin x^2\,dx$ may be written down by inspection as a so-called power series, and such series also give a simple, systematic
method of solving differential equations. Another use of power series is in
the study of functions of a complex variable z = x + iy; thus, from the
series for sin x one can ascertain the appropriate definition and the impor-
tant properties of sin z. A type of series known as Fourier series arises when
one studies the response of a linear system to a periodic input, for example,
in circuit analysis, in transmission-line problems, and in the theory of me-
chanical systems. Fourier series and their generalizations are also useful
for solving the boundary-value problems of mathematical physics. Inas-
much as an indiscriminate use of series may lead to incorrect results, the
applications presented in this chapter are accompanied by discussion of the
circumstances in which those applications are valid.
THE GENERAL THEORY
1. Convergence and Divergence. A series is a sum of terms. Thus, 1 + 3 + 5 is a series consisting of three terms, and $a_1 + a_2 + \cdots + a_n$ is a series consisting of n terms. An infinite series is a series
$$a_1 + a_2 + a_3 + \cdots + a_n + \cdots \tag{1-1}$$
which has infinitely many terms. We shall frequently use the symbol $\sum a_n$ to denote the series (1-1).
To get a numerical value for the expression (1-1) we consider the follow-
ing sequence of so-called partial sums of the series,
$$\begin{aligned}
s_1 &= a_1,\\
s_2 &= a_1 + a_2,\\
s_3 &= a_1 + a_2 + a_3,\\
&\;\;\vdots\\
s_n &= a_1 + a_2 + a_3 + \cdots + a_n,
\end{aligned} \tag{1-2}$$
and examine the limit of the nth partial sum $s_n$ as $n \to \infty$. If
$$\lim_{n\to\infty} s_n = s, \tag{1-3}$$
we say that the series converges to the sum s and write
$$s = a_1 + a_2 + a_3 + \cdots + a_n + \cdots.$$
If the limit of $s_n$ does not exist, the series (1-1) is said to diverge, and no numerical value is assigned to the series. The precise meaning of the statement (1-3) is that for any preassigned positive number $\epsilon$, however small, one can find a number N such that
$$|s - s_n| < \epsilon \qquad \text{for all } n > N. \tag{1-4}$$
To illustrate the definition (1-4) consider the series
$$\frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \cdots + \frac{1}{n(n+1)} + \cdots. \tag{1-5}$$
The first three partial sums of (1-5) are
$$s_1 = a_1 = \frac{1}{2}, \qquad s_2 = s_1 + a_2 = \frac{1}{2} + \frac{1}{6} = \frac{2}{3}, \qquad s_3 = s_2 + a_3 = \frac{2}{3} + \frac{1}{12} = \frac{3}{4},$$
and the nth partial sum is $s_n = n/(n + 1)$ (cf. Prob. 1). It is obvious that the limit of $s_n$ as $n \to \infty$ is 1. If, however, we want to prove this fact, we must demonstrate that for any preassigned number $\epsilon > 0$ we can find a number N such that the condition (1-4) is satisfied for all partial sums $s_n$ with n > N. In our problem
$$|s - s_n| = \left|1 - \frac{n}{n+1}\right| = \frac{1}{n+1}.$$
Given $\epsilon > 0$, we require, then, that
$$\frac{1}{n+1} < \epsilon \qquad \text{for } n > N,$$
and this is equivalent to
$$n + 1 > \frac{1}{\epsilon} \qquad \text{for } n > N.$$
Hence the choice $N = (1/\epsilon) - 1$ fulfills the requirement of the definition. If $\epsilon = 1/10$, then N = 9; if $\epsilon = 1/100$, then N = 99, and so on. To attain higher accuracy in approximating the sum of the series (1-5) by its nth partial sum $s_n$ we must, clearly, increase n.
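The choice $N = (1/\epsilon) - 1$ can be checked by exact rational arithmetic. The following Python lines are only a minimal sketch; the names `partial_sum` and `choose_N` are illustrative and do not come from the text.

```python
from fractions import Fraction


def partial_sum(n):
    """Exact n-th partial sum of the series (1-5): sum of 1/(k(k+1)), k = 1..n."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))


def choose_N(eps):
    """The choice N = 1/eps - 1 made in the text (eps is treated as a Fraction)."""
    return 1 / Fraction(eps) - 1


if __name__ == "__main__":
    for eps in (Fraction(1, 10), Fraction(1, 100)):
        N = choose_N(eps)
        n = int(N) + 1              # an integer n with n > N
        error = 1 - partial_sum(n)  # |s - s_n| with s = 1
        print(f"eps = {eps}, N = {N}, n = {n}, |s - s_n| = {error}")
```

For each tolerance the printed error $1/(n+1)$ is smaller than $\epsilon$, as the definition (1-4) requires.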
The number $\epsilon$ in (1-4) can be thought of as a measure of error made in approximating the sum s by the sum of its first n terms. The actual error in the approximation is
$$r_n = s - s_n \tag{1-6}$$
and the condition (1-4) demands that $|r_n| < \epsilon$ for all sufficiently large values of n. We shall call $r_n$ the remainder of the series (1-1) after n terms.
The limit (1-3) may fail to exist either when $s_n$ increases indefinitely with n or when the partial sums $s_n$ oscillate without approaching a limit as $n \to \infty$. Thus, the series
$$1 + 1 + 1 + 1 + \cdots$$
diverges because its nth partial sum $s_n = n$ increases with n without limit, while the series
$$1 - 1 + 1 - 1 + \cdots$$
diverges because its partial sums $s_1 = 1$, $s_2 = 0$, $s_3 = 1$, . . . oscillate.
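As another example, consider the so-called harmonic series
$$1 + \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{4} + \tfrac{1}{5} + \tfrac{1}{6} + \tfrac{1}{7} + \tfrac{1}{8} + \cdots + \tfrac{1}{n} + \cdots. \tag{1-7}$$
The terms of the series (1-7) may be grouped as follows:
$$1 + \tfrac{1}{2} + (\tfrac{1}{3} + \tfrac{1}{4}) + (\tfrac{1}{5} + \tfrac{1}{6} + \tfrac{1}{7} + \tfrac{1}{8}) + (\tfrac{1}{9} + \cdots + \tfrac{1}{16}) + \cdots. \tag{1-8}$$
Now, each term of the foregoing series is at least as large as the corresponding term of
$$\tfrac{1}{2} + \tfrac{1}{2} + (\tfrac{1}{4} + \tfrac{1}{4}) + (\tfrac{1}{8} + \tfrac{1}{8} + \tfrac{1}{8} + \tfrac{1}{8}) + (\tfrac{1}{16} + \cdots + \tfrac{1}{16}) + \cdots. \tag{1-9}$$
The latter series, however, reduces to
$$\tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} + \cdots,$$
which is divergent. Hence (1-8) is divergent.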
This example illustrates the idea of comparison, which is fundamental in the study of series. The divergence of (1-8) was established by comparing (1-8) with a simpler series, (1-9), whose divergence is obvious. The full chain of reasoning is as follows: “Each term of (1-8) is at least as great as the corresponding term of (1-9). Hence the partial sums of (1-8) are at least as great as the corresponding partial sums of (1-9). But the partial sums of (1-9) become arbitrarily large if we take enough terms. Hence the partial sums of (1-8) also become arbitrarily large, and the series diverges.” The student who understands this example will have no difficulty with the more detailed applications which follow.
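The comparison can also be observed numerically. The sketch below is illustrative only (the function names are not from the text); it sums the first $2^k$ terms of the harmonic series (1-7) and of the comparison series (1-9), whose partial sum is then $(k + 1)/2$.

```python
def harmonic_partial_sum(n):
    """Sum of the first n terms of the harmonic series (1-7)."""
    return sum(1.0 / k for k in range(1, n + 1))


def comparison_partial_sum(k):
    """Sum of the first 2**k terms of the comparison series (1-9).

    The first term is 1/2; after it, group j (j = 1, ..., k) consists of
    2**(j-1) terms, each equal to 1/2**j, so every group adds 1/2.
    """
    total = 0.5
    for j in range(1, k + 1):
        total += 2 ** (j - 1) * (1.0 / 2 ** j)
    return total


if __name__ == "__main__":
    for k in (1, 5, 10, 15):
        n = 2 ** k
        print(n, harmonic_partial_sum(n), comparison_partial_sum(k))
```

The harmonic partial sums stay above the comparison values, which grow without bound, although the growth is very slow.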
The use of the criterion (1-4) for convergence of the series (1-1) requires knowledge of its sum s. Frequently it is possible to infer the existence of a limit s without knowing its value. For example, consider the series
$$0.1 + 0.01 + 0.001 + \cdots$$
whose partial sums are $s_1 = 0.1$, $s_2 = 0.1 + 0.01 = 0.11$, $s_3 = 0.1 + 0.01 + 0.001 = 0.111$, and so on. Each partial sum, being a decimal, is less than 1. On the other hand the $s_n$ increase with n. If the successive values of $s_n$ are plotted as points on a straight line (Fig. 1), the points move to the right but never progress as far as the point 1. It is intuitively clear that there must be some point s, at the left of 1, which the numbers $s_n$ approach as limit. In this case the numerical value of the limit was not ascertained, but its existence has been established with the aid of a
Fundamental Principle: If an infinite sequence of numbers $s_n$ satisfies the condition $s_{n+1} > s_n$ for each n, and if $s_n < M$, where M is some fixed number, then $s_n$ has a limit that is not greater than M. In other words: Every bounded increasing sequence has a limit. Considering $-s_n$ instead of $s_n$ gives a corresponding statement for decreasing sequences.
[Fig. 1]
From the geometrical interpretation of the fundamental principle it appears that when an increasing sequence of partial sums $s_n$ has a limit, the difference between the successive values of $s_n$ must tend to zero as $n \to \infty$. Since $s_n - s_{n-1} = a_n$, the foregoing statement is equivalent to the assertion that $\lim_{n\to\infty} a_n = 0$. This can be established from the definition (1-3) without appeal to the fundamental principle and without the assumption that $s_n$ is increasing.
Indeed, since
$$a_n = s_n - s_{n-1} \tag{1-10}$$
and since the series converges by hypothesis, we have $\lim s_n = \lim s_{n-1} = s$ as $n \to \infty$. Hence (1-10) shows that
$$\lim a_n = \lim s_n - \lim s_{n-1} = 0. \tag{1-11}$$
We state the result (1-11) as a theorem:
Theorem I. If a series converges, then the general term must approach zero, and hence if the general term does not approach zero, the series diverges.
The reader is cautioned that the converse of this theorem is not true.
For instance, the harmonic series (1-7) was found to diverge even though
the general term 1/n approaches zero.
There is a more elaborate version of Theorem I which does have a converse. By writing out the sums in full we find a relation analogous to (1-10):
$$a_{m+1} + \cdots + a_n = s_n - s_m, \qquad n > m \ge 1. \tag{1-12}$$
If the infinite series converges, so that $\lim s_n = s$, then both the sums on the right of (1-12) become arbitrarily close to s, provided m and n are chosen large enough. Hence the right-hand side becomes arbitrarily small in magnitude, and we are led to the following: If $\sum a_k$ converges, then for any $\epsilon > 0$ there is an N such that
$$|a_m + a_{m+1} + \cdots + a_n| < \epsilon \tag{1-13}$$
whenever n > m > N. Now this statement admits a converse.¹ If, for each $\epsilon > 0$, there is an N such that (1-13) holds whenever n > m > N, then $\sum a_k$ converges. The theorem, together with its converse, constitutes the so-called Cauchy convergence criterion.
Example 1. A certain series has partial sums $s_n = r^n$, where r is a constant such that 0 < r < 1. By use of the fundamental principle, show that the series converges to zero.
We have to show that $\lim s_n = 0$ as $n \to \infty$, or in other words
$$\lim_{n\to\infty} r^n = 0 \qquad \text{for } 0 < r < 1. \tag{1-14}$$
Since r > 0, it is evident that $s_n > 0$, and hence the sequence $s_n$ is bounded from below. Also $r^{n+1} = r\,r^n$, or in other words
$$s_{n+1} = r s_n. \tag{1-15}$$
Since r < 1, this shows that $s_{n+1} < s_n$, so that the sequence $s_n$ is decreasing. Hence the limit of $s_n$ exists by the fundamental principle. If we write $s = \lim s_n$ and take the limit as $n \to \infty$ in (1-15), there results
$$s = \lim s_{n+1} = \lim (r s_n) = r \lim s_n = rs.$$
From s = rs it follows that s = 0, since $r \ne 1$, and this gives (1-14).
Example 2. The geometric series is defined by
$$1 + x + x^2 + x^3 + \cdots + x^n + \cdots.$$
Show that this series converges to 1/(1 − x) when |x| < 1 but diverges when |x| ≥ 1.
The geometric series is an example of a series
$$u_1(x) + u_2(x) + u_3(x) + \cdots + u_n(x) + \cdots$$
in which the terms are functions of x. For each choice of x the function $u_n(x)$ is simply a number, the series becomes a series of constants, and hence it can be tested for convergence just as any other series of constants is tested.
We have to decide whether the partial sums
$$s_n = 1 + x + x^2 + \cdots + x^{n-2} + x^{n-1} \tag{1-16}$$
tend to a limit. If the foregoing equation is multiplied by x, there results
$$x s_n = x + x^2 + \cdots + x^{n-1} + x^n \tag{1-17}$$
and subtracting (1-17) from (1-16) yields $s_n - x s_n = 1 - x^n$. Solving for $s_n$ we get
$$s_n = \frac{1 - x^n}{1 - x}. \tag{1-18}$$
¹ Since we shall not require the converse, the proof is not presented here. The interested reader is referred to I. S. Sokolnikoff, “Advanced Calculus,” pp. 11-13, McGraw-Hill Book Company, Inc., New York, 1939.
If |x| < 1, then $\lim_{n\to\infty} |x|^n = 0$ by (1-14) and hence (1-18) gives
$$\lim_{n\to\infty} s_n = \frac{1 - 0}{1 - x} = \frac{1}{1 - x}.$$
This establishes the required convergence when |x| < 1. On the other hand if |x| ≥ 1, the general term does not approach zero and the series diverges by Theorem I. The value x is called the ratio for the series, since x equals the ratio of two successive terms. We have shown that the geometric series converges if, and only if, the ratio is less than 1 in magnitude.
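The closed form (1-18) and the limit 1/(1 − x) are easy to check numerically. The following Python sketch is illustrative only; the function names are chosen here and are not part of the text.

```python
def geometric_partial_sum(x, n):
    """s_n = 1 + x + ... + x**(n-1), summed term by term as in (1-16)."""
    return sum(x ** k for k in range(n))


def closed_form(x, n):
    """The expression (1 - x**n) / (1 - x) of (1-18); requires x != 1."""
    return (1 - x ** n) / (1 - x)


if __name__ == "__main__":
    for x in (0.5, -0.9, 0.99):
        for n in (10, 100, 1000):
            print(x, n, geometric_partial_sum(x, n), closed_form(x, n), 1 / (1 - x))
```

For |x| < 1 the two computed values agree to rounding error and approach 1/(1 − x) as n increases; the approach is slow when |x| is close to 1.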
PROBLEMS
1. Show that the nth partial sum of the series (1-5) is n/(n + 1). Hint: Since 1/[n(n + 1)] = 1/n − 1/(n + 1), the sum of the first n terms is
$$s_n = (1 - \tfrac{1}{2}) + (\tfrac{1}{2} - \tfrac{1}{3}) + (\tfrac{1}{3} - \tfrac{1}{4}) + \cdots + [1/n - 1/(n + 1)].$$
A series such as this is called a telescoping series.
2. Show that the following series converges to zero if |r| < 1 but to 1 if r = 1. Sketch the graph of the sum as a function of r:
$$r + (r^2 - r) + (r^3 - r^2) + \cdots + (r^n - r^{n-1}) + \cdots.$$
2. Some Basic Properties of Series. We shall write infinite series in the condensed notation
$$\sum_{n=1}^{\infty} a_n \equiv a_1 + a_2 + a_3 + \cdots + a_n + \cdots. \tag{2-1}$$
Finite sums are expressed similarly, with the limits of summation (1, ∞) replaced by the appropriate values. The limits of summation are frequently omitted if they need not be emphasized or are clear from the context. Whenever the limits are omitted in Secs. 2 to 7 of this chapter, the reader may assume that the summation range is from 1 to ∞.
In many respects convergent series behave like finite sums. For example, if the sum of the series (2-1) is s and if each term of the series (2-1) is multiplied by a constant p, then
$$\sum p a_n = p \sum a_n = ps. \tag{2-2}$$
That is, a convergent series may be multiplied termwise by any constant. The proof of (2-2) follows at once from the observation that the partial sums $S_n$ of the series $\sum p a_n$ are related to the partial sums $s_n$ of (2-1) by
$$S_n = p s_n$$
and therefore
$$\lim S_n = p \lim s_n = ps.$$
If we are given two convergent series $\sum a_k$ and $\sum b_k$, then
$$\sum (a_k \pm b_k) = \sum a_k \pm \sum b_k. \tag{2-3}$$
That is, two convergent series may be added or subtracted term by term. Again the proof is simple. We denote the sum of the series $\sum a_k$ by A, that of $\sum b_k$ by B, and the corresponding partial sums by $A_n$ and $B_n$. Then the nth partial sum of $\sum (a_k \pm b_k)$ is
$$\sum_{k=1}^{n} (a_k \pm b_k) = A_n \pm B_n$$
and the result (2-3) follows on letting $n \to \infty$.
As an illustration, consider the geometric series
$$s = 1 + x + x^2 + \cdots + x^n + \cdots.$$
By (2-2)
$$xs = x + x^2 + \cdots + x^n + \cdots$$
and hence, by (2-3), we have s − xs = 1. This shows that if the series converges, it must converge to 1/(1 − x). The question of convergence was discussed in Example 2 of Sec. 1.
Another obvious but important property is used so often that we state it as a theorem:
Theorem I. If finitely many terms of an infinite series are altered, the convergence is not affected (though, of course, the value of the sum may be affected).
To prove this we denote the original terms by $a_k$ and the new terms by $a_k + b_k$, where all but a finite number¹ of $b_k$'s are zero. The result is then a consequence of (2-3). It should be noticed that this argument not only establishes convergence but shows that the new value of the sum can be found by the obvious arithmetical calculation. For instance, if the seventh term of a convergent series is increased by 2.4, the sum is also increased by 2.4, and similarly in other cases.
Example: Establish the divergence of
$$\tfrac{1}{12} + \tfrac{1}{16} + \tfrac{1}{20} + \tfrac{1}{24} + \cdots. \tag{2-4}$$
Multiplying by 4 we get the series
$$\tfrac{1}{3} + \tfrac{1}{4} + \tfrac{1}{5} + \tfrac{1}{6} + \cdots$$
which is obviously divergent, since it differs from the harmonic series $\sum 1/n$ only in that it lacks the first two terms. Hence (2-4) is divergent.
¹ Any finite series $b_1 + \cdots + b_n$ may be regarded as an infinite series with all terms beyond the nth equal to zero. If we do so regard it, the definition of convergence given in Sec. 1 makes the finite series converge to its ordinary sum.
This use of (2-2) to establish divergence is readily justified, even though (2-2) applies to convergent series only. Thus, assume that (2-4) converges. The foregoing analysis shows, then, that $\sum 1/n$ would have to converge, and that is a contradiction.
PROBLEMS
1. Write the following series in full, without using Σ notation:
^ 1 ~ / 43\» n 2 + 1 ^ 3
hi 2k' h\ U/'hi'Zn + Z hj' hi
(cos x) .
2. Write the following series in condensed form, using Σ notation:
(H) 2 4-(H) 3 + (H) 4 4-(H) 5 4-**-,
T[o55 + 1^002 + 1^004 + 1^006 + ’ ’
1 + -L + -L-
11-2 1-2-3 1-2-3-4
H + H 4- M 2 4 - Ms 4-* •
0.2 - 0.02 + 0.002
Ho + Ho + Mo + Mo 4 •
3. Some of the series in Probs. 1 and 2 are divergent because the general term does not approach zero. Which ones are they?
4. Some of the series in Probs. 1 and 2 are convergent because they are geometric series with ratio less than 1 in magnitude (or multiples of such a series). Which ones are they?
5. Some of the series in Probs. 1 and 2 are divergent because $\sum 1/n$ is divergent. Which ones are they?
6. Show that (1 − 1) + (1 − 1) + (1 − 1) + ··· converges but would diverge if the parentheses were dropped.
7. (a) Does the series 2 bb+©] converge? Explain, (b) Does the
converge? Explain. Hint In (a) see (1-5). In (6) note that
HMD] 4* . If the given series converges, what could you deduce about
3. Improper Integrals and the Integral Test. In the development of the calculus a definite integral such as $\int_a^b f(x)\,dx$ is defined, at first, only for a finite interval [a,b]. The extension to an infinite interval is then made by a simple passage to the limit; thus
$$\int_a^{\infty} f(x)\,dx = \lim_{b\to\infty} \int_a^b f(x)\,dx. \tag{3-1}$$
The integral at the left of (3-1) is called an improper integral. If the limit at the right exists, we say that the improper integral converges (to the value of the limit) and it diverges if the limit does not exist. The definition is quite analogous to the corresponding definition
$$\sum_{k=1}^{\infty} a_k = \lim_{n\to\infty} \sum_{k=1}^{n} a_k$$
for infinite series.
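An example of a divergent improper integral is
$$\int_1^{\infty} \frac{1}{x}\,dx = \lim_{b\to\infty} \int_1^b \frac{dx}{x} = \lim_{b\to\infty} \left(\log x \Big|_1^b\right) = \lim_{b\to\infty} \log b = \infty. \tag{3-2}$$
On the other hand if p is constant and $p \ne 1$, then
$$\int_1^{\infty} \frac{1}{x^p}\,dx = \lim_{b\to\infty} \int_1^b x^{-p}\,dx = \lim_{b\to\infty} \left(\frac{x^{1-p}}{1-p}\Big|_1^b\right) = \lim_{b\to\infty} \frac{b^{1-p} - 1}{1 - p}. \tag{3-3}$$
The question of convergence now depends on the behavior of $b^{1-p}$ as $b \to \infty$. If the exponent 1 − p is positive, then $b^{1-p} \to \infty$ and the integral (3-3), like (3-2), is divergent. But if 1 − p is negative, then p − 1 > 0 and hence
$$b^{1-p} = \frac{1}{b^{p-1}} \to 0, \qquad \text{as } b \to \infty.$$
In this case the integral (3-3) converges to the value 1/(p − 1).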
The result of this discussion may be summarized as follows:
Theorem I. The improper integral $\int_1^{\infty} \frac{1}{x^p}\,dx$ converges if, and only if, the constant p > 1.
Theorem I suggests the following analogous result for infinite series:
Theorem II. The infinite series $\sum_{k=1}^{\infty} \frac{1}{k^p}$ converges if, and only if, the constant p > 1.
It will be seen that Theorem II is valid; in fact, there is a close connection between infinite series and improper integrals which will now be discussed.
Suppose the terms of an infinite series $\sum a_k$ are positive and decreasing; that is, $a_n \ge a_{n+1} > 0$ for each positive integer n. In this case there is a continuous decreasing function f(x) such that¹
$$a_n = f(n), \qquad n = 1, 2, 3, \ldots. \tag{3-4}$$
Each term $a_n$ of the series may be thought of as representing the area of a rectangle of base unity and height f(n) (see Fig. 2). The sum of the areas of the first n circumscribed rectangles is greater than the area under the curve from 1 to n + 1, so that
$$a_1 + a_2 + \cdots + a_n \ge \int_1^{n+1} f(x)\,dx. \tag{3-5}$$
Hence if $\int_1^{\infty} f(x)\,dx$ diverges, then the sum $\sum a_k$ also diverges.
[Fig. 2]
On the other hand, the sum of the areas of the inscribed rectangles is less than the area under the curve, so that
$$a_2 + a_3 + \cdots + a_n \le \int_1^{n} f(x)\,dx. \tag{3-6}$$
If the integral converges, we have [since f(x) > 0]
$$\int_1^{n} f(x)\,dx \le \int_1^{\infty} f(x)\,dx \equiv M,$$
so that the partial sums are bounded independently of n:
$$s_n = a_1 + a_2 + \cdots + a_n \le M + a_1.$$
Since each $a_k$ is positive, these partial sums form an increasing sequence. Hence, the fundamental principle stated in Sec. 1 ensures that $\sum a_k$ is convergent.
¹ For instance, let the graph of y = f(x) consist of straight-line segments joining the points $(n, a_n)$ and $(n + 1, a_{n+1})$.
The result of this discussion may be summarized as follows:
Theorem III. For x ≥ 1 let f(x) be positive, continuous, and decreasing. Then the series $\sum_{n=1}^{\infty} f(n)$ and the integral $\int_1^{\infty} f(x)\,dx$ both converge or both diverge. In either case the partial sums are bounded as follows:
$$\int_1^{n+1} f(x)\,dx \le \sum_{k=1}^{n} f(k) \le \int_1^{n} f(x)\,dx + f(1). \tag{3-7}$$
Choosing $f(x) = x^{-p}$ in Theorem III, we see that Theorem II is a consequence of Theorem I. The test for convergence contained in Theorem III is commonly called the Cauchy integral test, though it was first discovered by Maclaurin. The result (3-7) is especially useful because it enables us to estimate the value of the sum.
Example 1. Show that the series
$$\frac{1}{1 + 1^2} + \frac{1}{1 + 2^2} + \frac{1}{1 + 3^2} + \frac{1}{1 + 4^2} + \cdots + \frac{1}{1 + n^2} + \cdots$$
converges to a value which is between 0.7 and 1.3.
Here we choose $f(x) = 1/(1 + x^2)$. Since
$$\int_1^b \frac{dx}{1 + x^2} = \tan^{-1} b - \frac{\pi}{4} \to \frac{\pi}{4} \qquad \text{as } b \to \infty,$$
the integral is convergent, and hence the series is convergent. Moreover
$$0.79 \approx \frac{\pi}{4} \le \sum_{k=1}^{\infty} \frac{1}{1 + k^2} \le \frac{\pi}{4} + \frac{1}{2} \approx 1.29$$
by letting $n \to \infty$ in (3-7) and noting that f(1) = 1/2. The next example shows how the accuracy in such an estimate may be improved to any extent desired.
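The bounds (3-7) for this example can be verified directly, since the integral of $1/(1 + x^2)$ is an inverse tangent. The Python lines below are a minimal sketch with illustrative function names, not a procedure given in the text.

```python
import math


def partial_sum(n):
    """Sum of 1/(1 + k^2) for k = 1..n."""
    return sum(1.0 / (1 + k * k) for k in range(1, n + 1))


def integral(a, b):
    """Integral of 1/(1 + x^2) from a to b, evaluated with the arctangent."""
    return math.atan(b) - math.atan(a)


def bounds(n):
    """Lower and upper bounds of (3-7) for f(x) = 1/(1 + x^2)."""
    lower = integral(1, n + 1)
    upper = integral(1, n) + 0.5  # f(1) = 1/2
    return lower, upper


if __name__ == "__main__":
    for n in (2, 10, 1000):
        lower, upper = bounds(n)
        print(n, lower, partial_sum(n), upper)
```

As n grows the bounds approach π/4 and π/4 + 1/2, the two values quoted in the example.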
Example 2. Compute the sum of the following series within ±0.01:
$$1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \frac{1}{25} + \frac{1}{36} + \cdots = \sum_{n=1}^{\infty} \frac{1}{n^2}.$$
It is easily verified that the first six terms give the sum 1.491. To estimate the remainder we have, from (3-7) on taking $f(x) = 1/(x + 6)^2$,
$$\int_1^{\infty} \frac{dx}{(x + 6)^2} \le \sum_{n=1}^{\infty} \frac{1}{(n + 6)^2} \le \int_1^{\infty} \frac{dx}{(x + 6)^2} + \frac{1}{49}. \tag{3-8}$$
The two limits in (3-8) are 0.143 and 0.163, as the reader can verify. Hence
$$1.634 = 1.491 + 0.143 \le s \le 1.491 + 0.163 = 1.654. \tag{3-9}$$
It is interesting to see how many terms are needed to get the same accuracy by direct computation. The remainder after n terms is given by (3-7) as
$$r_n \ge \int_{n+1}^{\infty} \frac{1}{x^2}\,dx = \frac{1}{n + 1}.$$
To make this as small as the uncertainty interval 1.654 − 1.634 obtained in (3-9), we must have 1/(n + 1) ≤ 0.02, or n ≥ 49. Thus, direct summation of the series requires almost 50 terms for the accuracy which we obtained by adding 6 terms only.
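The arithmetic of this example can be reproduced in a few lines. The sketch below is illustrative (the function names are chosen here); the comparison value π²/6 for the sum of the series is a standard result that is not derived in this section and is used only as a check.

```python
import math


def first_terms(m):
    """Sum of the first m terms of the series 1 + 1/4 + 1/9 + ..."""
    return sum(1.0 / n ** 2 for n in range(1, m + 1))


def tail_bounds(m):
    """Bounds from (3-7), with f(x) = 1/(x + m)^2, on the sum of 1/n^2 over n > m."""
    lower = 1.0 / (m + 1)               # integral of dx/(x + m)^2 from 1 to infinity
    upper = lower + 1.0 / (m + 1) ** 2  # the same integral plus f(1)
    return lower, upper


if __name__ == "__main__":
    head = first_terms(6)
    lower, upper = tail_bounds(6)
    print("first six terms:", head)
    print("bounds on the sum:", head + lower, head + upper)
    print("pi^2/6 for comparison:", math.pi ** 2 / 6)
```

With m = 6 the two bounds differ by 1/49, which is the width of the interval found in (3-9).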
PROBLEMS
1. Test the following integrals for convergence, and evaluate if convergent:
dx dx
r r e -, dXt /v.*, r_* r
Ji 1 + x Ji Ji h x(log x) 2 J 2 .
x logs
2 . Test the following series for convergence:
1 ^ A 1
' <n + 1) H
1 y' 1 y I v
’ n»2 n(Iog n) ^2 n(log n)' 01 w 1 -f n 2
3. (a) For what values of the constant c does $\int_1^{\infty} e^{cx}\,dx$ converge? (b) Using the result (a), discuss the convergence of $\sum e^{cn}$. (c) Show that the series (b) is a geometric series, and also show that your results are consistent with those of Sec. 1.
4. How many terms of the harmonic series $\sum n^{-1}$ are needed to make the sum of those terms larger than 1,000?
5. Estimate the value of $\sum_{n=1}^{\infty} n^{-4}$ by direct use of Theorem III and also by adding the first five terms and using Theorem III to estimate the remainder. In both cases find approximately how many terms of the series you would have to add up to get comparable accuracy.
Problem for Review
6. (a) By (1-18), show that the partial sums of the series
$$1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots + \frac{1}{2^n} + \cdots$$
are all less than 2.
(b) Show that the partial sums of the series
$$1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots + \frac{1}{n!} + \cdots$$
are also less than 2. Hint: Compare the partial sums with those of the series (a).
(c) Deduce, by the fundamental principle, that the series (b) converges.
4. Comparison Term by Term. One way to test a series of positive terms for convergence is to compare that series with another whose convergence is known. Let $\sum a_n$ and $\sum b_n$ be two series with positive terms such that $a_n \le b_n$ and $\sum b_n$ converges. The inequality
$$s_n = \sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k \le \sum_{k=1}^{\infty} b_k$$
shows that the partial sums $s_n$ are bounded, and since $s_n$ is increasing, the limit exists by the fundamental principle. It is left for the student to verify also that if $a_n \ge b_n \ge 0$ and $\sum b_n$ diverges, then $\sum a_n$ diverges.
This discussion establishes the following result, known as the comparison test:
Theorem I. If $0 \le a_n \le b_n$, then the convergence of $\sum a_n$ follows from the convergence of $\sum b_n$. And if $a_n \ge b_n \ge 0$, then the divergence of $\sum a_n$ follows from the divergence of $\sum b_n$.
Since the first few terms of a series do not affect the convergence, we need the hypothesis not for all n but only for n sufficiently large (see Sec. 2, Theorem I). Similar remarks apply to every convergence test, and we shall make constant use of this fact in the sequel.
For example, suppose we want to establish the convergence of $\sum 9/n^n$. Although the inequality
$$\frac{9}{n^n} < \frac{1}{2^n} \tag{4-1}$$
is not valid for all n, it is valid when n is sufficiently large. Hence the series converges by
comparison with the geometric series. Another example is given by the series
$$\sum_{n=2}^{\infty} \frac{1}{100 \log n}. \tag{4-2}$$
Although it is not true that
$$\frac{1}{100 \log n} > \frac{1}{n}$$
for all n, this is true for all sufficiently large n, and hence the series (4-2) diverges by comparison with the harmonic series.
It is customary to write $a_n \sim b_n$ (read “$a_n$ is asymptotic to $b_n$”) if
$$\lim_{n\to\infty} \frac{a_n}{b_n} = 1$$
(compare Chap. 1, Sec. 2). For example, $n + 1 \sim n$ and also $5n^2 + 3n + 4 \sim 5n^2$, but it is not the case that $2/n \sim 1/n$ even though the difference between these quantities tends to zero. In this notation we can state the following theorem, which is very useful for determining convergence:
Theorem II. If $a_n \sim b_n$ and $b_n > 0$, then the series $\sum a_n$ and $\sum b_n$ are both convergent or both divergent.
The proof is simple. Since $\lim (a_n/b_n) = 1$, we shall have
$$\frac{1}{2} < \frac{a_n}{b_n} < 2 \tag{4-3}$$
whenever n is sufficiently large. Equation (4-3) yields
$$\tfrac{1}{2} b_n < a_n < 2 b_n$$
and hence the conclusion follows from Theorem I together with (2-2).
Example 1. Does $\sum n^{-\log n}$ converge?
For all large n we have log n > 2 (since $\log n \to \infty$). Hence
$$\frac{1}{n^{\log n}} < \frac{1}{n^2}$$
for all large n, and the series converges by comparison with the convergent series $\sum 1/n^2$ (Theorem II, Sec. 3).
Example 2. Does $\sum (n^2 + 5n + 3)^{-1/2}$ converge?
Inasmuch as $n^2 + 5n + 3 = n^2(1 + 5/n + 3/n^2) \sim n^2$, we have
$$(n^2 + 5n + 3)^{-1/2} \sim (n^2)^{-1/2} = n^{-1} = \frac{1}{n}.$$
Since $\sum 1/n$ diverges, the given series diverges.
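The asymptotic relation used in Example 2 can be observed by computing the ratio $a_n/b_n$ for increasing n. The following Python sketch is illustrative only; the function name `ratio` is not from the text.

```python
def ratio(n):
    """a_n / b_n for a_n = (n^2 + 5n + 3)^(-1/2) and b_n = 1/n."""
    a_n = (n * n + 5 * n + 3) ** -0.5
    b_n = 1.0 / n
    return a_n / b_n


if __name__ == "__main__":
    for n in (10, 100, 10_000, 1_000_000):
        print(n, ratio(n))
```

The ratio increases toward 1, so for large n it lies between 1/2 and 2, which is all that the proof of Theorem II requires.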
Example 3. Consider the series
_ / ra 4 + 4n» + 1
\7 n 1 + 5n* + 8 n)
Since n* + 4n* + 1 ~ n 4 , and since 7n 7 + 5n 4 + 8n
/ n 4 \H J_ _1_
\7nV ”7 m t»W
totic to
i n , tue general
— — r
The series with general term 1 /n* converges by Theorem II, Sec. 3, and hence the given
series also converges.
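Examples 2 and 3 illustrate two properties of the relation $\sim$ which are now set forth explicitly. First, we show that any polynomial is asymptotic to its leading term. Indeed, if $a \ne 0$ and $m \ge 1$, then as $n \to \infty$,
$$\frac{an^m + bn^{m-1} + \cdots + rn + s}{an^m} = 1 + \frac{b}{an} + \cdots + \frac{r}{an^{m-1}} + \frac{s}{an^m} \to 1.$$
This shows that $an^m + bn^{m-1} + \cdots + rn + s \sim an^m$, as stated.
Second, if $a_n \sim b_n$ and $c_n \sim d_n$, then it follows that
$$a_n^{\alpha} c_n^{\beta} \sim b_n^{\alpha} d_n^{\beta}$$
for any constants $\alpha$ and $\beta$. To establish this consider the ratio
$$\frac{a_n^{\alpha} c_n^{\beta}}{b_n^{\alpha} d_n^{\beta}} = \left(\frac{a_n}{b_n}\right)^{\alpha} \left(\frac{c_n}{d_n}\right)^{\beta} \to 1^{\alpha}\,1^{\beta} = 1.$$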
PROBLEMS
1. Test the following series for convergence by comparing with the series $\sum 1/n^p$:
v _£ y 1 V ,lS v _Jl
V r t 2ny/ n + 1 (2« + l) J (2» -I- 1)*
2. Test the following series for convergence by using Theorem II:
s 5l± A vAt "!, v (l , J y v 3 1+J!, v «l±I.
n 3 5 + 1 3n 6 -H n \n n 2 ) 4 n -f 5 n a 4 -f 4
3. Test the following series for convergence by any method:
1 In 4
y.-n* V f V . } V
’ w log (n -f 1) ~ n^” w « 6 -j- 3
4. (a) If $a_n \sim b_n$ and $b_n \sim c_n$, show that $a_n \sim c_n$. (b) If $a_n \sim b_n$ and $c_n \sim d_n$, is it necessary that $a_n + c_n \sim b_n + d_n$? Prove your answer by an example. (c) Find $a_n$ and $b_n$ such that $a_n \sim b_n$ but $a_n - b_n \to \infty$. (d) Find $a_n$ and $b_n$ such that $a_n/b_n \to \infty$ but $a_n - b_n \to 0$.
Problem for Review
5. (a) By direct use of the definition of limit show that 0.111111 . . . = 1/9. Hint: If $s_1 = 0.1$, $s_2 = 0.11$, $s_3 = 0.111$, . . . , then $|s_1 - 1/9| = 1/90$, $|s_2 - 1/9| = 1/900$, and so on. (b) With $s_n$ as in (a), and with $\epsilon > 0$, how large must you choose N to make $|s_n - 1/9| < \epsilon$ for all n > N?
(c) If s = 0.111111 . . . , evaluate s by considering 10s − s. (d) Evaluate s in (c) by the formula for the sum of a geometric series.
5. Comparison of Ratios. It often happens that the general term of an infinite series is complicated whereas the ratio of two successive terms is simple. For example, in the series
$$\sum \frac{x^{2n}}{n!} \tag{5-1}$$
we have
$$\frac{a_{n+1}}{a_n} = \frac{x^{2n+2}}{(n + 1)!}\,\frac{n!}{x^{2n}} = \frac{x^2}{n + 1}. \tag{5-2}$$
The following theorem enables us to deduce convergence by considering this ratio rather than the general term itself:
Theorem I. Let $\sum a_n$ and $\sum b_n$ be two series with positive terms. If
$$\frac{a_{n+1}}{a_n} \le \frac{b_{n+1}}{b_n}, \qquad n = 1, 2, 3, \ldots, \tag{5-3}$$
then the convergence of $\sum a_n$ follows from the convergence of $\sum b_n$. And if
$$\frac{a_{n+1}}{a_n} \ge \frac{b_{n+1}}{b_n}, \qquad n = 1, 2, 3, \ldots, \tag{5-3a}$$
then the divergence of $\sum a_n$ follows from the divergence of $\sum b_n$.
The proof is simple. In the first case we have
$$a_n = a_1\,\frac{a_2}{a_1}\,\frac{a_3}{a_2} \cdots \frac{a_n}{a_{n-1}} \le a_1\,\frac{b_2}{b_1}\,\frac{b_3}{b_2} \cdots \frac{b_n}{b_{n-1}} = \frac{a_1}{b_1}\,b_n.$$
Hence the convergence of $\sum b_n$ implies that of $\sum a_n$ by the comparison test (Theorem I, Sec. 4). The discussion of (5-3a) is similar.
If we take $b_n = r^n$ in Theorem I, then $\sum b_n$ converges whenever r < 1. Also
$$\frac{b_{n+1}}{b_n} = \frac{r^{n+1}}{r^n} = r.$$
Hence the theorem shows that $\sum a_n$ converges if there is a fixed number r < 1 such that
$$\frac{a_{n+1}}{a_n} \le r, \qquad n = 1, 2, 3, \ldots. \tag{5-4}$$
Since the condition (5-4) is needed only for large n, the series $\sum a_n$ also converges whenever
$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = r < 1. \tag{5-5}$$
The test based on (5-4) and (5-5) is termed the ratio test. To illustrate the ratio test consider the series (5-1). By (5-2) we have
$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = \lim_{n\to\infty} \frac{x^2}{n + 1} = 0$$
and hence (5-5) holds for all x. Thus the series (5-1) converges for all x.
The ratio test is useful but very crude. It cannot even establish the convergence of a series such as $\sum n^{-100}$, which is rapidly convergent. To obtain a better test one may use the series $\sum 1/n^p$ for $\sum b_n$ rather than the geometric series. In this case
$$\frac{b_{n+1}}{b_n} = \frac{1}{(n + 1)^p}\,n^p = \left(\frac{n}{n + 1}\right)^p = \left(1 + \frac{1}{n}\right)^{-p}.$$
By the binomial theorem¹
$$\left(1 + \frac{1}{n}\right)^{-p} = 1 - \frac{p}{n} + \frac{p(p + 1)}{2n^2} - \cdots$$
and hence
$$\frac{b_{n+1}}{b_n} - 1 \sim -\frac{p}{n}.$$
Since $\sum b_n$ converges if p > 1 and diverges if p < 1, we are led to the result stated in part (b) of Theorem II. The result (5-5) is stated in part (a).
Theorem II. Let $\sum a_n$ be a series of positive terms, and let r and p be constant. (a) If $a_{n+1}/a_n \sim r$, then $\sum a_n$ converges when r < 1 and diverges when r > 1. (b) If $a_{n+1}/a_n - 1 \sim -p/n$, then $\sum a_n$ converges when p > 1 and diverges when p < 1.
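Both parts of Theorem II suggest a numerical experiment: evaluate the ratio $a_{n+1}/a_n$, or the quantity $n(1 - a_{n+1}/a_n)$, at a large index n. The Python sketch below is illustrative only (the function names are chosen here), and a value computed at one finite n is merely suggestive of the limit; it proves nothing.

```python
def ratio_estimate(a, n):
    """a(n+1)/a(n) at index n, a rough stand-in for r in Theorem IIa."""
    return a(n + 1) / a(n)


def p_estimate(a, n):
    """n * (1 - a(n+1)/a(n)), which tends to p when a(n+1)/a(n) - 1 ~ -p/n."""
    return n * (1 - a(n + 1) / a(n))


if __name__ == "__main__":
    series = {
        "n^2 / 2^n": lambda n: n ** 2 / 2.0 ** n,
        "1 / n^2": lambda n: 1.0 / n ** 2,
        "1 / n": lambda n: 1.0 / n,
    }
    for name, a in series.items():
        print(name, ratio_estimate(a, 500), p_estimate(a, 500))
```

For the first series the ratio is near 1/2, so part (a) applies. For the other two the ratio is near 1 and only the second quantity, near 2 and near 1 respectively, carries information.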
Example 1. Does $\sum n^2/2^n$ converge?
With $a_n = n^2/2^n$ we have
$$\frac{a_{n+1}}{a_n} = \frac{(n + 1)^2}{n^2}\,\frac{2^n}{2^{n+1}} \sim \frac{1}{2}.$$
Hence the series converges by the ratio test, Theorem IIa.
Example 2. Apply the ratio test to the harmonic series.
With $a_n = 1/n$ we have
$$\frac{a_{n+1}}{a_n} = \frac{n}{n + 1} \sim 1.$$
Since this is the case r = 1, the test gives no information. Moreover,
$$\frac{a_{n+1}}{a_n} - 1 = \frac{n}{n + 1} - 1 = -\frac{1}{n + 1} \sim -\frac{1}{n}.$$
Since this is the case p = 1, the more refined test of Theorem IIb also gives no information.²
¹ The binomial theorem for arbitrary exponents is established in Sec. 12.
² More general tests may be found in I. S. Sokolnikoff, “Advanced Calculus,” chap. 7, McGraw-Hill Book Company, Inc., New York, 1939.
Example 3. For what values of the constant c does the following converge?
$$\frac{c}{1!} + \frac{c(c + 1)}{2!} + \frac{c(c + 1)(c + 2)}{3!} + \cdots$$
For sufficiently large n the terms are of constant sign, and hence Theorem II is applicable. We have
$$\frac{a_{n+1}}{a_n} = \frac{c(c + 1)(c + 2) \cdots (c + n)}{(n + 1)!}\,\frac{n!}{c(c + 1) \cdots (c + n - 1)} = \frac{c + n}{n + 1}.$$
Since
$$\frac{c + n}{n + 1} - 1 = \frac{c - 1}{n + 1} \sim \frac{c - 1}{n} = -\frac{1 - c}{n},$$
the series is convergent if 1 − c > 1 and divergent if 1 − c < 1. Hence, it is convergent when c < 0 and divergent when c > 0. In this example Theorem IIa gives no information but Theorem IIb solves the problem completely.
PROBLEMS
1. Determine the convergence by using the ratio test, Theorem IIa:
X
n!
n*
v fr 2 -f l) n
W ’ n!
2. Show that Theorem IIa gives no information, and test for convergence by Theorem IIb:
v } ? v *}' f v l 2 alL.
n 2 ** r(c -j~ l)(c -f- 2) . . . (c + n) 4 n (n!) 2
3. Test for convergence by any method :
• n<_ log n, v 2"_+JP f v 2n - 1
1*3*5... (2n -4-1) n 2 * w 3 n + 4"' " 2n +T
4. Give an example of a divergent series $\sum a_n$ such that all the terms satisfy $a_n > 0$ and $a_{n+1}/a_n < 1$. Does this contradict the remarks made in connection with (5-4)?
5. If x is constant prove that $\lim_{n\to\infty} x^n/n! = 0$. Hint: The series $\sum |x|^n/n!$ converges by the ratio test.
6. Absolute Convergence. The preceding tests for convergence apply
to series with positive terms. We shall now see how these tests can be used
to establish convergence even when the signs of the terms change infinitely often.¹
Definition. A series $\sum a_n$ is said to be absolutely convergent if the series of absolute values $\sum |a_n|$ is convergent.
¹ If all but a finite number of terms have the same sign, then we may consider those terms only (Sec. 2, Theorem I). Multiplication by −1, if necessary, yields a series with positive terms, so that the foregoing methods apply. This fact was used in Example 3 of the preceding section.
For example, the series $\sum (-1)^n/n^2$ is absolutely convergent, since
$$\sum |a_n| = \sum \frac{1}{n^2}$$
converges. On the other hand the series $\sum (-1)^n/n$ is not absolutely convergent,
as the reader can verify. The importance of absolute convergence
stems partly from the following theorem:
Theorem I. If $\sum |a_n|$ converges, then $\sum a_n$ converges.
In other words, an absolutely convergent series is convergent. The definition
of absolute value yields
$$0 \le \frac{a_n + |a_n|}{2} \le |a_n|.$$
Hence, by the comparison test, the series
$$\sum \frac{a_n + |a_n|}{2}$$
converges when $\sum |a_n|$ converges. And then the series with general term
$$a_n = 2\cdot\frac{a_n + |a_n|}{2} - |a_n|$$
converges by (2-2) and (2-3).
To illustrate the use of Theorem I consider the series
$$\sum \frac{\cos nx}{2^n}. \tag{6-1}$$
Since the signs change infinitely often,¹ none of the preceding methods is applicable.
We may, however, apply those methods to the series of absolute values. In view of the
fact that
$$\frac{|\cos nx|}{2^n} \le \frac{1}{2^n},$$
the series
$$\sum \frac{|\cos nx|}{2^n}$$
is convergent. Hence the original series (6-1) is convergent.
A series whose terms are alternately positive and negative is called an
alternating series. There is a simple test due to Leibniz that establishes
the convergence of many such series even when the series does not converge
absolutely.
¹ Except when x is an integral multiple of $2\pi$.
Theorem II. Suppose the alternating series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ is such that
$a_n > a_{n+1} > 0$ and $\lim a_n = 0$. Then the series converges, and the remainder
after n terms has a value which is between zero and the first term not taken.
For example, if the sum of the series is approximated by the first five
terms
$$s \approx s_5 = a_1 - a_2 + a_3 - a_4 + a_5,$$
then the error in that approximation is between zero and $-a_6$:
$$0 > s - s_5 > -a_6.$$
The value given by $s_5$ is too large, because $s_5$ ends with a positive term, $+a_5$.
The value $s_6$ is too small, since $s_6$ ends with a negative term, and so on.
To prove the theorem, we have
$$s_{2n} = (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{2n-1} - a_{2n})$$
$$\phantom{s_{2n}} = a_1 - (a_2 - a_3) - \cdots - (a_{2n-2} - a_{2n-1}) - a_{2n}$$
and hence $s_{2n}$ is positive but less than $a_1$ for all n. Also
$$s_2 < s_4 < s_6 < \cdots$$
so that these sums tend to a limit by the fundamental principle (Sec. 1). Since
$s_{2n+1} = s_{2n} + a_{2n+1}$ and $\lim a_{2n+1} = 0$, it follows that the partial sums of odd order tend to
this same limit, and hence the series converges. The proof of the second statement is
left as an exercise for the reader. Actually Theorem II becomes rather obvious when we
plot the partial sums on the s axis.
Since the choice $a_n = 1/n$ satisfies the requirements of Theorem II, the
alternating harmonic series
$$s = 1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \cdots + \frac{(-1)^{n+1}}{n} + \cdots \tag{6-2}$$
is convergent. If the sum is approximated by the first two terms, then
Theorem II says that the error is between 0 and $\tfrac13$; that is, $0 < s - \tfrac12 < \tfrac13$, or
$$\tfrac12 < s < \tfrac56. \tag{6-3}$$
Inasmuch as the series of absolute values diverges, we could not establish
the convergence by use of Theorem I. A series such as this, which converges
but not absolutely, is said to be conditionally convergent.
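The remainder estimate of Theorem II can be checked numerically for the alternating harmonic series. The sketch below is only an illustration: the function name is hypothetical, and the value log 2 for the sum is taken as an assumption here (it is obtained in Sec. 8 by means of Abel's theorem).

```python
import math

def alternating_harmonic_partial_sum(n):
    """Partial sum s_n = 1 - 1/2 + 1/3 - ... + (-1)^(n+1)/n."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

# Assumption: the sum of the series is log 2 (shown later via Abel's theorem).
s = math.log(2)

for n in (2, 5, 10, 100):
    s_n = alternating_harmonic_partial_sum(n)
    remainder = s - s_n
    # First term not taken: (-1)^(n+2) / (n+1)
    first_omitted = (-1) ** (n + 2) / (n + 1)
    lo, hi = sorted((0.0, first_omitted))
    # Theorem II: the remainder lies between zero and the first omitted term.
    print(n, s_n, remainder, lo < remainder < hi)
```

With n = 2 the check reproduces the bounds 1/2 < s < 5/6 of (6-3).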
By rearranging the order of terms in a conditionally convergent series, one can make
the resulting series converge to any desired value. In illustration of this fact we shall
rearrange the series (6-2) in such a way that the new sum is $\pi$, though (6-3) shows that
the original sum is not $\pi$.
The terms of (6-2) are obtained by choosing alternately from the series
$$1 + \tfrac13 + \tfrac15 + \tfrac17 + \cdots \tag{6-4}$$
and from the series
$$-\tfrac12 - \tfrac14 - \tfrac16 - \tfrac18 - \cdots \tag{6-5}$$
both of which are divergent. To form a series that converges to $\pi$, first pick out, in order,
as many positive terms (6-4) as are needed to make the sum just greater than $\pi$. Then
pick out, in order, enough negative terms (6-5) so that the sum of all terms so far chosen
will be just less than $\pi$. Then choose more positive terms until the total sum is just
greater than $\pi$, and so on. The process is possible because the series (6-4) and (6-5) are
divergent; the resulting series converges to $\pi$ because the error is less than the last term
taken.
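The rearrangement procedure just described can be sketched in a few lines. The function name and the number of terms are illustrative choices, not part of the text; the rule "add a positive term while the sum does not exceed the target, otherwise add a negative term" is one simple way to realize the description above.

```python
import math

def rearranged_partial_sums(target, n_terms):
    """Partial sums of a rearrangement of 1 - 1/2 + 1/3 - ... aimed at `target`.

    Positive terms 1, 1/3, 1/5, ... and negative terms -1/2, -1/4, ...
    are each used in their original order.
    """
    total = 0.0
    next_odd = 1    # denominator of the next unused positive term
    next_even = 2   # denominator of the next unused negative term
    sums = []
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / next_odd
            next_odd += 2
        else:
            total -= 1.0 / next_even
            next_even += 2
        sums.append(total)
    return sums

sums = rearranged_partial_sums(math.pi, 100_000)
print(sums[-1], abs(sums[-1] - math.pi))
```

The convergence is slow, since many positive terms are needed to offset each negative one, but the partial sums stay within the size of the last term used on either side of the target.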
To get a physical interpretation of this result, suppose we place unit positive charges
P at the points
$$x = 1,\ -\sqrt2,\ \sqrt3,\ -\sqrt4,\ \sqrt5,\ -\sqrt6,\ \ldots$$
and attempt to find the force on a unit negative charge N located at the origin (see Fig.
3). By Coulomb's law two opposite unit charges a distance $\sqrt n$ apart experience an
attraction of magnitude $1/n$.
[Fig. 3. The x axis with N at the origin and charges marked at $-\sqrt8, -\sqrt6, -\sqrt4, -\sqrt2, 1, \sqrt3, \sqrt5, \sqrt7$.]
Since the attraction of charges at the left of N exerts a
force toward the left whereas attraction of the other charges exerts a force toward the
right, the total force on N is given formally by the series (6-2). Now, the fact that
this series is conditionally convergent makes the force depend not only on the final
configuration of charges but also on the manner in which the charges were introduced. If
we obtained the final configuration by putting 10 charges at the left, then 1 at the right,
then 100 more at the left, and 1 again at the right, and so on, the net force will be
directed toward the left. But if we had a preponderance of charges at the right while
setting up the final configuration, then the final force would be directed toward the right.
The foregoing behavior is perhaps not very surprising. What is surprising is that a
rearrangement such as this will always give the same value provided the series in
question is absolutely convergent. For example, let the configuration consist of unit positive
charges P at the points $x = 1, -2, 3, -4, 5, -6, \ldots$, so that the force is given by the
absolutely convergent series
$$1 - \frac{1}{2^2} + \frac{1}{3^2} - \frac{1}{4^2} + \cdots + \frac{(-1)^{n+1}}{n^2} + \cdots.$$
In this case, as we shall show, the force does not depend on the way in which the final
configuration was reached.¹
The preceding examples may assist the reader to appreciate the following
theorem, which describes what is perhaps the most important property of
absolute convergence.
Theorem III. The terms of an absolutely convergent series may be
rearranged in any manner without altering the value of the sum.
¹ One may say that the "charges at infinity" now have no influence, whereas in the
former case (6-2) they were important.
We establish this result first for series of positive terms. Let $\sum p_k$ be
such a series and $\sum p'_k$ a rearrangement. For every n we have
$$\sum_{k=1}^{n} p'_k \le \sum_{k=1}^{\infty} p_k$$
inasmuch as each term $p'_k$ is to be found among the terms of $\sum p_k$. Hence
$\sum p'_k$ converges (by the fundamental principle), and also
$$\sum p'_k \le \sum p_k.$$
In just the same way we find $\sum p_k \le \sum p'_k$, and hence $\sum p'_k = \sum p_k$.
To obtain the result for an arbitrary but absolutely convergent series
$\sum a_k$, denote the rearrangement by $\sum a'_k$ and observe that
$$a_k = (a_k + |a_k|) - |a_k|,$$
$$a'_k = (a'_k + |a'_k|) - |a'_k|. \tag{6-6}$$
By the result for positive series we have
$$\sum |a'_k| = \sum |a_k|,$$
$$\sum (a'_k + |a'_k|) = \sum (a_k + |a_k|).$$
Hence (6-6) gives $\sum a'_k = \sum a_k$ when we recall (2-3).
By methods quite similar to the foregoing¹ one can establish the following,
which expresses a third fundamental property of absolutely convergent
series:
Theorem IV. If $\sum a_k = a$ and $\sum b_k = b$ are absolutely convergent, then
these series can be multiplied like finite sums and the product series will
converge to ab. Moreover, the product series is absolutely convergent, hence may
be rearranged in any manner. For example,
$$ab = a_1b_1 + (a_2b_1 + a_1b_2) + (a_3b_1 + a_2b_2 + a_1b_3) + \cdots.$$
Example: Consider the series $\sum x^n/\sqrt n$.
With $a_n = x^n/\sqrt n$ we have
$$\frac{|a_{n+1}|}{|a_n|} = \left|\frac{x^{n+1}}{\sqrt{n+1}}\cdot\frac{\sqrt n}{x^n}\right| \to |x|.$$
Hence the series converges absolutely if $|x| < 1$ and diverges if $|x| > 1$. To see what
happens when $x = \pm 1$, we substitute these values into the original series, obtaining
$$\sum \frac{(-1)^n}{\sqrt n} \quad\text{and}\quad \sum \frac{1}{\sqrt n}$$
for $x = -1$ and $x = +1$, respectively. The first series is conditionally convergent,
and the second is divergent. Hence the series converges absolutely when $|x| < 1$, it
converges conditionally when $x = -1$, and it diverges for all other values of x.
1 The proof is given in full in Sokolnikoff, op. cit., pp. 242-244.
PROBLEMS
1. Classify the following series as absolutely convergent, conditionally convergent, or
divergent:
‘(HIT V, lv , g"+ 8 V HL‘ v w 1
' Vn’ 2 n" ' " 7? ’ w 2 V 3
3*6 3-6-9
1 - 3 • 5 • 7
+ •
3*6*9*12
2. Determine the values of x for which the following series are absolutely convergent,
conditionally convergent, or divergent:
S(-l)" S(-l) n ff y 2(-l )V, 2 L, S - (-i. r --)"2n!x" ( 2
n (2n) f nx n n \i + 4/ log (n -f 1)
3. Approximately how many terms of the series $\sum_{1}^{\infty} (-1)^n/n^4$ are needed to give the
sum within $10^{-8}$? Evaluate the sum to two places of decimals.
7. Uniform Convergence. If a finite number of functions that are all
continuous in an interval¹ [a, b] are added together, the sum is also a
continuous function in [a, b]. The question arises as to whether or not this
property will be retained in the case of an infinite series of continuous
functions. Moreover, it is frequently desirable to obtain the derivative
(or integral) of a function f(x) by means of term-by-term differentiation
(or integration) of an infinite series that defines f(x). Unfortunately such
operations are not always valid, and many important investigations have
led to erroneous results solely because of the improper handling of infinite
series. The analysis of these questions is based on a property known as
uniform convergence, which is now to be described.
If a series of functions $\sum_{n=1}^{\infty} u_n(x)$ converges for each value of x in an
interval [a, b], then the sum defines a function of x,
$$s(x) = \sum u_n(x).$$
We denote the nth partial sum by $s_n(x)$,
$$s_n(x) = u_1(x) + u_2(x) + \cdots + u_n(x),$$
and the remainder after n terms by $r_n(x)$:
$$r_n(x) = s(x) - s_n(x) = u_{n+1}(x) + u_{n+2}(x) + \cdots. \tag{7-1}$$
Since the series converges to s(x), $\lim s_n(x) = s(x)$ as $n \to \infty$, and hence
$$\lim_{n\to\infty} r_n(x) = 0. \tag{7-2}$$
The statement embodied in (7-2) means that for any preassigned positive
number $\epsilon$, however small, one can find a number N such that
$$|r_n(x)| < \epsilon \quad\text{for all } n > N.$$
¹ We use [a, b] to indicate the closed interval $a \le x \le b$.
It is important to note that, in general, the magnitude of N depends not
only on the choice of $\epsilon$ but also on the value of x.
This last remark may be clarified by considering the series
$$x + (x - 1)x + (x - 1)x^2 + \cdots + (x - 1)x^{n-1} + \cdots.$$
Since
$$s_n(x) = x + (x - 1)x + (x - 1)x^2 + \cdots + (x - 1)x^{n-1} = x^n,$$
it is evident that
$$\lim_{n\to\infty} s_n(x) = \lim_{n\to\infty} x^n = 0, \quad\text{if } 0 \le x < 1.$$
Thus, $s(x) = 0$ for all values of x in the interval $0 \le x < 1$, and therefore
$$|r_n(x)| = |s_n(x) - s(x)| = |x^n - 0| = x^n.$$
Hence, the requirement that $|r_n(x)| < \epsilon$, for an arbitrary $\epsilon$, will be satisfied only if
$x^n < \epsilon$. This inequality leads to the condition
$$n \log x < \log \epsilon.$$
Since log x is negative for x between 0 and 1, it follows that it is necessary to have
$$n > \frac{\log \epsilon}{\log x},$$
which clearly shows the dependence of N on both $\epsilon$ and x. In fact, if $\epsilon = 0.01$ and
$x = 0.1$, n must be greater than $\log 0.01/\log 0.1 = -2/(-1) = 2$, so that N can be
chosen as any number greater than 2. If $\epsilon = 0.01$ and $x = 0.5$, N must be chosen
larger than $\log 0.01/\log 0.5$, which is greater than 6. Since the values of log x approach
zero as x approaches 1, the ratio $\log \epsilon/\log x$ will increase indefinitely and it will be
impossible to find a single value of N which will serve for $\epsilon = 0.01$ and for all values of
x in $0 \le x < 1$.
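The dependence of N on x in this example is easy to tabulate. The following sketch (the function name is a hypothetical choice) evaluates the threshold log ε / log x for ε = 0.01 at several values of x approaching 1.

```python
import math

def threshold(eps, x):
    """Value of log(eps)/log(x); the condition x**n < eps requires n > threshold."""
    return math.log(eps) / math.log(x)

eps = 0.01
for x in (0.1, 0.5, 0.9, 0.99, 0.999):
    print(x, threshold(eps, x))
```

The printed thresholds grow without bound as x approaches 1, which is the reason no single N can serve for the whole interval.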
This is the situation which is to be expected in general. In many important
cases, however, it is possible to find a single, fixed N, for any
preassigned positive $\epsilon$, which will serve for all values of x in the interval. The
series is then said to be uniformly convergent.
Definition. The series $\sum u_n(x)$ is uniformly convergent in the interval
[a, b] if for each $\epsilon > 0$ there is a number N, independent of x, such that the
remainder $r_n(x)$ satisfies $|r_n(x)| < \epsilon$ for all $n > N$.
It is the words in boldface type that give the whole distinction between
ordinary convergence and uniform convergence.
To illustrate this distinction in a specific case, we shall discuss the geometric series
$\sum_{n=0}^{\infty} x^n$ on the interval $-\tfrac12 \le x \le \tfrac12$.
According to the result of Sec. 1, Example 2, the sum, partial sum, and remainder
are, respectively,
$$s(x) = \frac{1}{1-x}, \qquad s_n(x) = \frac{1 - x^n}{1-x}, \qquad r_n(x) = \frac{x^n}{1-x}. \tag{7-3}$$
The condition $|r_n(x)| < \epsilon$ gives $|x^n| < \epsilon(1 - x)$ or, upon taking the logarithm and
solving for n,
$$n > \frac{\log \epsilon(1-x)}{\log |x|}. \tag{7-4}$$
Again it appears that the choice of N depends on both x and $\epsilon$, but in this case it is
possible to choose an N that will serve for all values of x in $[-\tfrac12, \tfrac12]$. Given a small $\epsilon$,
the ratio $\log \epsilon(1 - x)/\log |x|$ assumes its maximum value when $x = +\tfrac12$. Hence if
N is chosen so that
$$N \ge \frac{\log (\epsilon/2)}{\log \tfrac12} = 1 - \frac{\log \epsilon}{\log 2},$$
then the inequality (7-4) will be satisfied for all $n > N$.
Upon recalling the conditions for uniform convergence, we see that the series $\sum x^n$
converges uniformly for $-\tfrac12 \le x \le \tfrac12$. However, the series does not converge
uniformly in the interval (-1, 1), for, in this interval, the ratio appearing in (7-4) will
increase indefinitely as x approaches the values $\pm 1$.
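As a numerical check of this choice of N, the sketch below takes ε = 0.001, computes N from the bound just given, and evaluates the remainder (7-3) on a grid of points in the interval. The grid spacing and the helper names are assumptions made only for the illustration.

```python
import math

def uniform_N(eps):
    """Smallest integer N with N >= 1 - log(eps)/log(2)."""
    return math.ceil(1 - math.log(eps) / math.log(2))

def remainder(x, n):
    """r_n(x) = x^n / (1 - x) for the geometric series, as in (7-3)."""
    return x ** n / (1 - x)

eps = 1e-3
N = uniform_N(eps)
xs = [-0.5 + k / 100 for k in range(101)]   # grid on [-1/2, 1/2]
worst = max(abs(remainder(x, N + 1)) for x in xs)
print(N, worst, worst < eps)
```

The largest remainder on the grid occurs at x = 1/2, in agreement with the remark that the ratio in (7-4) is greatest there.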
Generally speaking, any test for convergence becomes a test for uniform
convergence provided its conditions are satisfied uniformly, that is,
independently of x. For instance, the ratio test takes the form: If there is a
number r independent of x such that for all large n
$$\left|\frac{u_{n+1}(x)}{u_n(x)}\right| \le r < 1,$$
then $\sum u_n(x)$ converges uniformly. Similarly, the comparison test takes
the form: If $\sum v_n(x)$ is a uniformly convergent series such that $|u_n(x)| \le v_n(x)$,
then $\sum u_n(x)$ converges uniformly. The simplest example of a
uniformly convergent series $\sum v_n(x)$ is a series of constants. Choosing such a
series in the comparison test, we are led to the so-called Weierstrass M test:
Theorem I. If there is a convergent series of constants, $\sum M_n$, such that
$|u_n(x)| \le M_n$ for all values of x on [a, b], then the series $\sum u_n(x)$ is uniformly
(and absolutely) convergent on [a, b].
The proof is simple. Since $\sum M_n$ is convergent, for any prescribed $\epsilon > 0$
there is an N such that
$$M_{n+1} + M_{n+2} + M_{n+3} + \cdots < \epsilon \quad\text{for all } n > N.$$
By the ordinary comparison test $\sum u_n(x)$ converges for each x, so that $r_n(x)$
is well defined. We have, moreover,
$$|r_n(x)| = |u_{n+1}(x) + u_{n+2}(x) + \cdots| \le |u_{n+1}(x)| + |u_{n+2}(x)| + \cdots \le M_{n+1} + M_{n+2} + \cdots < \epsilon$$
for all $n > N$. Since N does not depend on x, this establishes the theorem.
The other tests for uniform convergence mentioned above are established
similarly.
The fact that the Weierstrass test establishes the absolute convergence,
as well as the uniform convergence, of a series means that it is applicable
only to series which converge absolutely. There are other tests that are
not so restricted, but these tests are more complex. It should be
emphasized that a series may converge uniformly but not absolutely, and vice
versa.
To illustrate the use of the M test consider the series $\sum_{n=1}^{\infty} \dfrac{\sin nx}{n^2}$. Since $|\sin nx| \le 1$
for all values of x, the convergent series $\sum 1/n^2$ will serve as an M series. It follows that
$\sum (\sin nx)/n^2$ is uniformly and absolutely convergent on every interval, no matter how
large.
For another example consider the geometric series $\sum x^n$. In any interval $[-a, a]$ with
$0 < a < 1$ the series of positive constants $\sum a^n$ could be used as an M series, since
$|x^n| \le a^n$ on the given interval and since $\sum a^n$ converges.
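The first illustration can be examined numerically. In the sketch below the infinite sums are truncated at a large index K, which is an assumption of the illustration rather than part of the M test; the function names and the grid of x values are likewise hypothetical.

```python
import math

def partial_sum(x, n):
    """s_n(x) for the series sum sin(kx)/k^2."""
    return sum(math.sin(k * x) / k ** 2 for k in range(1, n + 1))

def m_tail(n, upto):
    """M_{n+1} + ... + M_upto with M_k = 1/k^2."""
    return sum(1.0 / k ** 2 for k in range(n + 1, upto + 1))

n, K = 20, 2000
xs = [-10 + 0.4 * j for j in range(51)]      # sample points on [-10, 10]
bound = m_tail(n, K)
worst = max(abs(partial_sum(x, K) - partial_sum(x, n)) for x in xs)
print(worst, bound, worst <= bound)
```

The same bound holds at every sample point, whatever its location, which is the uniformity asserted by the theorem.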
The importance of uniform convergence rests upon the following
theorems:
Theorem II. Let $\sum u_k(x)$ be a series such that each $u_k(x)$ is a continuous
function of x in the interval [a, b]. If the series is uniformly convergent in
[a, b], then the sum of the series is also a continuous function of x in [a, b].
Theorem III. If a series of continuous functions $\sum u_n(x)$ converges
uniformly to s(x) in [a, b], then
$$\int_\alpha^\beta s(x)\,dx = \int_\alpha^\beta u_1(x)\,dx + \int_\alpha^\beta u_2(x)\,dx + \cdots + \int_\alpha^\beta u_n(x)\,dx + \cdots,$$
where $a \le \alpha \le b$ and $a \le \beta \le b$. Moreover, the convergence is uniform with
respect to $\alpha$ and $\beta$.
Theorem IV. Let $\sum u_k(x)$ be a series of differentiable functions that
converges to s(x) in [a, b]. If the series $\sum u'_k(x)$ converges uniformly in [a, b], then
it converges to $s'(x)$.
The proof is not difficult, and serves well to illustrate the idea of uniform convergence
(see words in boldface). In Theorem II, if x and x + h are on [a, b], we have
$$s(x) = s_n(x) + r_n(x),$$
$$s(x + h) = s_n(x + h) + r_n(x + h),$$
and hence
$$s(x + h) - s(x) = s_n(x + h) - s_n(x) + r_n(x + h) - r_n(x). \tag{7-5}$$
Given $\epsilon > 0$, pick n so that $|r_n(t)| < \epsilon$ for all t on [a, b]. Now, $s_n(x)$ is a finite sum of
continuous functions, hence continuous. Therefore
$$|s_n(x + h) - s_n(x)| < \epsilon$$
whenever $|h|$ is sufficiently small. From (7-5) it follows that
$$|s(x + h) - s(x)| \le |s_n(x + h) - s_n(x)| + |r_n(x + h)| + |r_n(x)| < \epsilon + \epsilon + \epsilon.$$
This shows that $|s(x + h) - s(x)|$ becomes arbitrarily small provided $|h|$ is sufficiently
small, and hence s(x) is continuous.
For Theorem III, note that s(x) and $r_n(x)$ are continuous by Theorem II. Hence
$$\int_\alpha^\beta s(x)\,dx = \int_\alpha^\beta s_n(x)\,dx + \int_\alpha^\beta r_n(x)\,dx.$$
If we choose n so large that $|r_n(x)| < \epsilon$ for all x on [a, b], then
$$\left|\int_\alpha^\beta s(x)\,dx - \int_\alpha^\beta s_n(x)\,dx\right| \le \left|\int_\alpha^\beta \epsilon\,dx\right| = |\beta - \alpha|\,\epsilon \le (b - a)\epsilon.$$
Since the finite sum $s_n(x)$ can be integrated term by term and since $(b - a)\epsilon$ is arbitrarily
small independently of $\alpha$ and $\beta$, the desired result follows. Theorem IV follows from
Theorem III when $u'_k(x)$ is continuous;¹ we simply write down the differentiated series
and integrate term by term.
A geometric interpretation of uniform convergence may be obtained by
considering the graphs of $y = s(x)$ and of the nth approximating curves
$y = s_n(x)$. The condition $|r_n(x)| < \epsilon$ is equivalent to
$$s(x) - \epsilon < s_n(x) < s(x) + \epsilon \tag{7-6}$$
which means that the graph of $y = s_n(x)$ lies in a strip of width $2\epsilon$
centered on the graph of $y = s(x)$ (see Fig. 4). No matter how narrow
the strip may be, this condition must hold for all sufficiently large n;
otherwise the convergence is not uniform.
With such an interpretation, many facts about uniform convergence
become rather obvious. For example, the conclusion of Theorem III is
$$\int_\alpha^\beta s(x)\,dx = \lim_{n\to\infty} \int_\alpha^\beta s_n(x)\,dx \tag{7-7}$$
and the truth of (7-7) is strongly suggested by considering appropriate
areas in Fig. 4.
A graphical illustration of nonuniform convergence is given in Fig. 5.
Here, the partial sums
$$s_n(x) = \frac{n^2 x}{1 + n^3 x^2} \tag{7-8}$$
are plotted for n = 3, 5, and 10. By inspection of (7-8)
$$s(x) = \lim s_n(x) = 0, \quad -\infty < x < \infty.$$
¹ A proof free of this restriction is given in K. Knopp, "Theory and Application of
Infinite Series," p. 343, Blackie & Son, Ltd., Glasgow.
Nevertheless the approximating curves (7-8) have peaks near x = 0 which
grow higher with increasing n. Since $y = s_n(x)$ does not lie in a strip
$-\epsilon < y < \epsilon$ for arbitrarily small¹ $\epsilon$ and all large n, the convergence is not
uniform in any interval containing the point x = 0.
By looking at Fig. 5 one cannot easily see whether the areas under the
curves $y = s_n(x)$ tend to 0 or not; that is, one cannot tell whether (7-7)
holds or not. A short calculation based on (7-8) shows that, in fact, (7-7)
does hold. Thus, the conclusion of Theorem III may be true even when
the convergence is not uniform. It is left for the student to verify that
(7-7) does not hold when $\alpha = 0$, $\beta = 1$, and, instead of (7-8),
$$s_n(x) = \frac{n^2 x}{1 + n^2 x^2 \log n}. \tag{7-9}$$
The graphs of $y = s_n(x)$ in (7-9) give a figure quite similar to Fig. 5.
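The "short calculation based on (7-8)" can be reproduced numerically. The sketch below uses the elementary antiderivative of (7-8) on [0, 1] and the location of the peak found by setting the derivative to zero; both are worked out here as part of the illustration and are not quoted from the text, and the function names are hypothetical.

```python
import math

def s_n(x, n):
    """Partial sums (7-8): n^2 x / (1 + n^3 x^2)."""
    return n ** 2 * x / (1 + n ** 3 * x ** 2)

def integral_0_1(n):
    """Integral of s_n over [0, 1], in closed form: log(1 + n^3) / (2n)."""
    return math.log(1 + n ** 3) / (2 * n)

def peak(n):
    """Maximum of s_n, attained at x = n^(-3/2); equals sqrt(n)/2."""
    return s_n(n ** -1.5, n)

for n in (3, 5, 10, 100, 10_000):
    print(n, peak(n), integral_0_1(n))
```

The peaks grow like sqrt(n)/2 while the areas tend to zero, so (7-7) holds for (7-8) with α = 0, β = 1 even though the convergence is not uniform near x = 0.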
PROBLEMS
1. The partial sums of a series are $s_n(x) = x^n$. Show that the series is uniformly
convergent in the interval $[0, \tfrac12]$.
2. By using the definition of uniform convergence, show that
$$\frac{1}{x+1} - \frac{1}{(x+1)(x+2)} - \cdots - \frac{1}{(x+n-1)(x+n)} - \cdots$$
is uniformly convergent in the interval $0 \le x \le 1$. Hint: Rewrite the series to show that
$s_n(x) = 1/(x + n)$ and therefore $s_n(x) - s(x) = 1/(x + n)$. See Prob. 1, Sec. 1.
¹ In this case the condition does not even hold for large values of $\epsilon$.
3. Test the following series for uniform convergence:
S 2(10*)", Sn(Bin *)», 2
4. Test for uniform convergence the series obtained by term-by-term differentiation
of the four series given in Prob. 3.
5. Plot the sequence $s_n(x) = nx/(1 + nx)$ versus x for $0 \le x \le 1$ and for n = 10,
100, 1,000. Does $\lim s_n(x) = s(x)$ exist for every x? Is the convergence uniform on
$0 \le x \le 1$? Is s(x) continuous? Does
$$\lim \int_\alpha^\beta s_n(x)\,dx = \int_\alpha^\beta s(x)\,dx$$
for all $\alpha$, $\beta$ on [0, 1]?
6. If $s_n(x) = 2nxe^{-nx^2}$, $0 \le x \le 1$, show that
$$\lim_{n\to\infty} \int_0^1 s_n(x)\,dx - \int_0^1 \lim_{n\to\infty} s_n(x)\,dx = 1.$$
Is the convergence $s_n(x) \to s(x)$ uniform?
Problem for Review
7. Show that $\sum a_n$ converges absolutely if $\lim \sqrt[n]{|a_n|} = r < 1$. Hint: Choose $r'$ so
that $r < r' < 1$. Then $\sqrt[n]{|a_n|} < r'$ for sufficiently large n, and hence $|a_n| < (r')^n$.
POWER SERIES AND TAYLOR’S FORMULA
8. Properties of Power Series. One of the most important types of
infinite series is the power series¹
$$\sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots \tag{8-1}$$
so called because it is arranged in ascending powers of the variable. Typical
examples are given by the three series²
$$\sum x^n n!, \qquad \sum \frac{x^n}{n!}, \qquad \sum x^n, \tag{8-2}$$
which were already encountered in the foregoing sections.
For many power series the region of convergence is easily determined by
means of the ratio test. In the first series (8-2), for instance, the ratio of
two successive terms leads to
$$\left|\frac{x^n n!}{x^{n-1}(n-1)!}\right| = |xn| = |x|\,n \quad\text{for } x \ne 0$$
¹ Throughout Secs. 8 to 14, $\sum$ means $\sum_{n=0}^{\infty}$ rather than $\sum_{n=1}^{\infty}$.
² It is customary to take 0! = 1, so that the relation $n! = n(n - 1)!$ will hold for
n = 1 as well as for n = 2, 3, 4, ....
and hence the series converges only for x = 0. In just the same way it is
found that the second series gives a ratio $|x|/n$, which approaches zero.
Hence the second series converges for all x. The third series is the
geometric series, which, as we know, converges for $|x| < 1$.
It is a remarkable fact that every power series, without exception,
behaves like one of these three examples. The series converges for x = 0
only, or it converges for all x, or there is a number r such that¹ the series
converges whenever $|x| < r$ but diverges whenever $|x| > r$. The number
r is called the radius of convergence, and the interval $|x| < r$ is called the
interval of convergence. The fact that every power series has an interval of
convergence may be deduced² from the following theorem:
Theorem I. If $\sum a_n x^n$ converges for a particular value $x = x_0$, then the
series converges absolutely whenever $|x| < |x_0|$ and uniformly in the interval
$|x| \le |x_1|$ for each fixed $x_1$ such that $|x_1| < |x_0|$. And if it diverges for
$x = x_0$, then it diverges for all x such that $|x| > |x_0|$.
To establish Theorem I, observe that $\lim a_n x_0^n = 0$, since $\sum a_n x_0^n$
converges (Sec. 1, Theorem I). Hence $|a_n x_0^n| < 1$ for all sufficiently large n, or
$$|a_n| < \frac{1}{|x_0|^n} \quad\text{for all } n > N, \text{ say.} \tag{8-3}$$
This shows that $\sum |a_n|\,|x|^n$ converges by comparison with the geometric
series
$$\sum \left(\frac{|x|}{|x_0|}\right)^n$$
provided $|x| < |x_0|$. The statement concerning uniform convergence is
established by the same calculation, since $\sum (|x_1|/|x_0|)^n$ serves as an M
series for the Weierstrass M test. Finally, the statement concerning divergence
follows from the result on convergence. That is, if the series converged
for x, it would have to converge for $x_0$, since $|x_0| < |x|$, and this
is contrary to the hypothesis.
The uniform convergence mentioned in Theorem I shows that a power
series represents a continuous function for all values of x interior to its
interval of convergence (see Theorem II, Sec. 7). For instance, $\sum x^n = 1/(1 - x)$
is continuous for $|x| < 1$, though not at x = 1. We shall soon
see that such functions not only are continuous but have derivatives of all
orders and the derivatives can be found by termwise differentiation of
the series.
¹ For simplicity of nomenclature one may incorporate the first two cases into the third
by allowing r = 0 and r = ∞. The case r = 0 arises when the series converges for x = 0
only, whereas r = ∞ if the series converges for all x.
² A complete discussion is given in Sec. 16.
As an illustration of this fact consider the geometric series $\sum x^n$ mentioned
above. Term-by-term differentiation yields the series $\sum nx^{n-1}$. Because
of the coefficient n, which tends to infinity, one might expect the latter
series to have a smaller interval of convergence than the former. Actually,
however, the intervals are the same. Since
$$\left|\frac{nx^{n-1}}{(n-1)x^{n-2}}\right| = \frac{n}{n-1}\,|x| \to |x|, \quad\text{as } n \to \infty,$$
the ratio test shows that the differentiated series, like the original series,
has the interval of convergence $|x| < 1$. A similar result is found if we
differentiate repeatedly. Each differentiation multiplies the ratio by
$n/(n - 1)$. Inasmuch as $n/(n - 1) \to 1$, this factor does not change the
limit of the ratio, hence does not change the interval of convergence.
For many power series the ratio $|a_{n+1}/a_n|$ has no limit as $n \to \infty$, and
the foregoing analysis does not apply. However, suppose the series (8-1)
converges for some value $x = x_0 \ne 0$, so that, as before, we have the
estimate (8-3). If $|x| < |x_0|$, the differentiated series $\sum na_n x^{n-1}$ converges
by comparison with
$$\sum \frac{n}{|x_0|}\left(\frac{|x|}{|x_0|}\right)^{n-1}.$$
(Note that the latter series was shown to be convergent in the previous
paragraph.) The same calculation establishes uniform convergence of the
derivative series if $|x| \le |x_1| < |x_0|$, since
$$\sum \frac{n}{|x_0|}\left(\frac{|x_1|}{|x_0|}\right)^{n-1}$$
serves as an M series for the Weierstrass M test. Hence, the result of the
differentiation is actually the derivative of the original series $\sum a_n x^n$ (see
Theorem IV, Sec. 7).
The foregoing argument is practically identical with that used to prove
Theorem I. A third use of the same method establishes the corresponding
result for the integrated series $\sum a_n x^{n+1}/(n + 1)$. In this case the comparison
series are, respectively,
$$\sum \frac{|x|}{n+1}\left(\frac{|x|}{|x_0|}\right)^n \quad\text{and}\quad \sum \frac{|x_1|}{n+1}\left(\frac{|x_1|}{|x_0|}\right)^n.$$
Summarizing this discussion we can state the following, which is perhaps
the most important and useful result in the whole theory of power series:
Theorem II. A power series may be differentiated (or integrated) term
by term in any interval interior to its interval of convergence. The resulting
series has the same interval of convergence as the original series and represents
the derivative (or integral) of the function to which the original series converges.
Consider, for example, the geometric series
$$(1 - x)^{-1} = 1 + x + x^2 + \cdots + x^n + \cdots, \quad |x| < 1. \tag{8-4}$$
Differentiating termwise we obtain
$$(1 - x)^{-2} = 1 + 2x + 3x^2 + \cdots + nx^{n-1} + \cdots, \quad |x| < 1. \tag{8-5}$$
Differentiating again gives an expansion for $(1 - x)^{-3}$, and so on. Since the series (8-4)
converges for $|x| < 1$, Theorem II shows without further discussion that all these other
expansions are also valid for $|x| < 1$.
On the other hand, if the series (8-4) is integrated termwise from zero to x, there results
an expansion
$$-\log (1 - x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n} + \cdots, \quad |x| < 1, \tag{8-6}$$
which can be used for numerical computation of the logarithm.
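As a small illustration of such a computation, the sketch below sums the first terms of (8-6) and compares the result with the library logarithm. The function name and the number of terms are arbitrary choices made for the example.

```python
import math

def neg_log_one_minus(x, n_terms):
    """Partial sum x + x^2/2 + ... + x^n/n of the series (8-6), valid for |x| < 1."""
    return sum(x ** k / k for k in range(1, n_terms + 1))

# x = 1/2 gives -log(1/2) = log 2.
approx = neg_log_one_minus(0.5, 60)
print(approx, math.log(2), abs(approx - math.log(2)))
```

For x close to 1 many more terms are needed, since the terms then decrease only slightly faster than 1/n.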
Equations (8-4) to (8-6) give power-series representations for the functions
on the left. It will now be established that such representations are
always unique.
Theorem III. If two power series converge to the same sum throughout
an interval, then corresponding coefficients are equal.
For proof, assume that $\sum a_n x^n = \sum b_n x^n$, so that, by (2-3),
$$0 = (a_0 - b_0) + (a_1 - b_1)x + (a_2 - b_2)x^2 + \cdots + (a_n - b_n)x^n + \cdots.$$
The choice x = 0 yields $a_0 = b_0$. Differentiating with respect to x yields
$$0 = (a_1 - b_1) + 2(a_2 - b_2)x + \cdots + n(a_n - b_n)x^{n-1} + \cdots$$
and if we now set x = 0, we get $a_1 = b_1$. Upon differentiating again and
setting x = 0, we get $a_2 = b_2$, and so on.
This process not only shows that the coefficients are uniquely determined
but yields a simple formula for their values. Let
$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots, \quad\text{for } |x| < x_0.$$
Upon differentiating n times we get
$$f^{(n)}(x) = 0 + 0 + \cdots + 0 + n!\,a_n + \cdots$$
where the second group of terms "$+ \cdots$" involves x, $x^2$, or higher powers.
These terms disappear when we set x = 0, and hence $f^{(n)}(0) = n!\,a_n$, or
$$a_n = \frac{f^{(n)}(0)}{n!}. \tag{8-7}$$
In the following section we shall be led to the same formula (8-7), though
by an entirely different method.
The algebraic properties described in Sec. 2 for series in general give
corresponding properties for power series: Two power series may be added
term by term, a power series may be multiplied by a constant, and so on.
Since power series converge absolutely in the interval of convergence,
Theorem IV, Sec. 6, yields the following additional property:
Theorem IV. Two power series may be multiplied like polynomials for
values x which are interior to both intervals of convergence. Thus,
$$\left(\sum a_n x^n\right)\left(\sum b_n x^n\right) = \sum c_n x^n,$$
where $c_n = a_0 b_n + a_1 b_{n-1} + a_2 b_{n-2} + \cdots + a_n b_0$.
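The formula for the coefficients of the product series is easily carried out on truncated coefficient lists. The sketch below is an illustration only; the function name is hypothetical, and only as many product coefficients are returned as can be computed exactly from the given truncations.

```python
def cauchy_product(a, b):
    """Coefficients c_0, c_1, ... with c_n = a_0 b_n + a_1 b_{n-1} + ... + a_n b_0.

    `a` and `b` hold the leading coefficients of two power series; the result
    has min(len(a), len(b)) entries.
    """
    n = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]

# Squaring the geometric series (8-4): every coefficient is 1.
geometric = [1] * 6
print(cauchy_product(geometric, geometric))
```

The coefficients obtained for the square of the geometric series are 1, 2, 3, ..., in agreement with the expansion (8-5) of $(1 - x)^{-2}$.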
So far, nothing has been said about the behavior of power series at the ends of the
interval of convergence. As a matter of fact all behaviors are possible. For example,
each of the series
$$\ldots, \qquad \sum \frac{(-1)^n x^n}{n}$$
has $|x| < 1$ as interval of convergence. However, the first series converges at x = 1
and -1, the second diverges at x = 1 and -1, the third converges at x = -1 but
diverges at x = 1, and the fourth diverges at x = -1 but converges at x = 1.
For applications, the most important theorem concerning the behavior at the ends
of the convergence interval is Abel's theorem¹ on continuity of power series, which
reads as follows:
Abel's Theorem. Suppose the power series $\sum a_n x^n$ converges for $x = x_0$, where $x_0$ may
be an end point of the interval of convergence. Then
$$\lim_{x \to x_0} \sum a_n x^n = \sum a_n x_0^n$$
provided $x \to x_0$ through values interior to the interval of convergence.
To illustrate the theorem, let $x \to -1$ through values greater than -1 in the series
(8-6). The limit of the left side is $-\log 2$, since the logarithm is continuous, and the
limit of the right side is
$$\sum \frac{(-1)^n}{n}$$
by virtue of Abel's theorem. Hence,
$$\log 2 = 1 - \frac12 + \frac13 - \frac14 + \cdots + \frac{(-1)^{n+1}}{n} + \cdots.$$
As another example of Abel's theorem, let $x \to 1$ in Theorem IV to obtain the
following: If
$$c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0,$$
then $(\sum a_n)(\sum b_n) = \sum c_n$ provided each series is convergent. Hence, with the particular
arrangement of the product series which is given by $\sum c_n$, we do not need absolute
convergence as in Theorem IV of Sec. 6.
PROBLEMS
1. Find the interval of convergence, and determine the behavior at the end points
of the interval:
T 'ln r 2n +1 <n. 3 r 2n
2 (—
¹ A proof is given in I. S. Sokolnikoff, "Advanced Calculus," pp. 278-279,
McGraw-Hill Book Company, Inc., New York, 1939.
2. Show that the radius of convergence of $\sum a_n x^n$ is given by $r = \lim_{n\to\infty} |a_n/a_{n+1}|$
whenever this limit exists.
3. (a) By letting $x = -t^2$ in (8-4), obtain the expansion
$$\frac{1}{1 + t^2} = 1 - t^2 + t^4 - t^6 + \cdots + (-1)^n t^{2n} + \cdots.$$
(b) By integrating from zero to x, obtain an expansion for $\tan^{-1} x$. (c) Using your
result (b), show that
4. (a) Show that the series $y = \sum x^n/n!$ satisfies $y' = y$. (b) Deduce an expansion for
$e^x$. For what values of x is this expansion valid? (c) Obtain series expansions for e and
1/e by taking $x = \pm 1$ in (b). (d) Using your series, compute e and 1/e to three significant
figures, and check your work by finding the product e(1/e).
5. Using results given in the text, express the following integrals as power series:
r.A , r
Jo 1 + t i Jo
log (1 4 U) dt,
L
U dt
o d~~~ flj*'
Hint • In the third ease, for example, let x - t z in (8-5), multiply through by f 6 , and
finally integrate term by term
6. By multiplication of series obtain the expansion of $(1 + x + x^2 + \cdots + x^n + \cdots)^2$.
In particular, compute the coefficients of 1, x, $x^2$, $x^3$, and $x^n$ in the product series.
9. Taylor's Formula. The usefulness of power series is greatly increased
by the so-called Taylor formula, which yields the power-series expansion for
an arbitrary function $f(x)$ together with an expression for the remainder
after $n$ terms. Let $f(x)$ be a function with a continuous $n$th derivative
throughout the interval $[a,b]$. Taylor's formula is obtained by integrating
this $n$th derivative $n$ times in succession between the limits $a$ and $x$, where
$x$ is any point on $[a,b]$. Thus,
$$\int_a^x f^{(n)}(x)\,dx = f^{(n-1)}(x)\Big|_a^x = f^{(n-1)}(x) - f^{(n-1)}(a),$$
$$\int_a^x\!\!\int_a^x f^{(n)}(x)\,(dx)^2 = \int_a^x f^{(n-1)}(x)\,dx - \int_a^x f^{(n-1)}(a)\,dx = f^{(n-2)}(x) - f^{(n-2)}(a) - (x-a)f^{(n-1)}(a),$$
$$\int_a^x\!\!\int_a^x\!\!\int_a^x f^{(n)}(x)\,(dx)^3 = f^{(n-3)}(x) - f^{(n-3)}(a) - (x-a)f^{(n-2)}(a) - \frac{(x-a)^2}{2!}f^{(n-1)}(a),$$
$$\cdots$$
$$\int_a^x\!\cdots\!\int_a^x f^{(n)}(x)\,(dx)^n = f(x) - f(a) - (x-a)f'(a) - \frac{(x-a)^2}{2!}f''(a) - \cdots - \frac{(x-a)^{n-1}}{(n-1)!}f^{(n-1)}(a).$$
Solving for $f(x)$ gives
$$f(x) = f(a) + (x-a)f'(a) + \frac{(x-a)^2}{2!}f''(a) + \cdots + \frac{(x-a)^{n-1}}{(n-1)!}f^{(n-1)}(a) + R_n, \tag{9-1}$$
where
$$R_n = \int_a^x\!\cdots\!\int_a^x f^{(n)}(x)\,(dx)^n. \tag{9-2}$$
The formula given by (9-1) is known as Taylor's formula, and the particular
form of $R_n$ given in (9-2) is called the integral form of the remainder after
$n$ terms. The Lagrangian form of the remainder, which is often more useful, is
$$R_n = \frac{(x-a)^n}{n!}f^{(n)}(\xi), \qquad a < \xi < x. \tag{9-3}$$
To derive this from the form (9-2), let $M$ be the maximum and $m$ the minimum of
$f^{(n)}(t)$ for $a \le t \le x$. Then the integral (9-2) clearly lies between
$$\int_a^x\!\cdots\!\int_a^x M\,(dx)^n \quad\text{and}\quad \int_a^x\!\cdots\!\int_a^x m\,(dx)^n.$$
Upon carrying out the integration we find that these bounds are
$$\frac{(x-a)^n}{n!}M \quad\text{and}\quad \frac{(x-a)^n}{n!}m,$$
respectively. Since the continuous function $f^{(n)}(t)$ assumes all values between its maximum
$M$ and minimum $m$, there must be a number $t = \xi$ such that (9-3) holds. We
have written our inequalities for the case $a < x$; in any case, $\xi$ is between $a$ and $x$.
In general the remainder $R_n$ depends on $x$, as is obvious from the representation
(9-2). It may happen, however, that $f(x)$ has derivatives of all
orders and that the remainder $R_n$ approaches zero as $n \to \infty$ for each
value $x$ on $[a,b]$. In this case we obtain a representation for $f(x)$ as an infinite series
$$f(x) = \sum_{n=0}^{\infty}\frac{f^{(n)}(a)(x-a)^n}{n!}, \tag{9-4}$$
and $R_n$ now gives the error which arises when the series is approximated by
its $n$th partial sum. The series in (9-4) is called the Taylor series for $f(x)$
about the point $x = a$. The special case
$$f(x) = \sum_{n=0}^{\infty}\frac{f^{(n)}(0)x^n}{n!} \tag{9-5}$$
is often called Maclaurin's series, though Taylor's work preceded Maclaurin's.
To illustrate the use of Taylor's formula, let $f(x) = e^x$. Then $f'(x) = e^x$, $f''(x) = e^x$, $\ldots$,
and hence $f^{(n)}(0) = 1$. Equation (9-5) suggests that
$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots \tag{9-6}$$
and indeed, by the ratio test this series converges for all $x$. However, to show that it
converges to $e^x$ we must consider the remainder, which takes the form
$$R_n = \frac{x^n}{n!}e^{\xi}, \qquad 0 < \xi < x, \tag{9-7}$$
when we use (9-3). Since this approaches zero$^1$ as $n \to \infty$, the series does converge to $e^x$.
As another example we find the expansion of $\cos x$ in powers of $x - (\pi/2)$. The values
of $f$, $f'$, $f''$, $f'''$ are, respectively,
$$\cos x, \quad -\sin x, \quad -\cos x, \quad \sin x. \tag{9-8}$$
Since the next term is $f^{\mathrm{iv}} = \cos x$, the next four derivatives repeat the sequence (9-8),
the next four repeat again, and so on. Evaluating at $x = \pi/2$ we get, respectively,
$$0, -1, 0, 1;\quad 0, -1, 0, 1;\quad 0, -1, 0, 1;\ \ldots$$
and hence Eq. (9-4) suggests the expansion
$$\cos x = -\left(x - \frac{\pi}{2}\right) + \frac{(x - \pi/2)^3}{3!} - \frac{(x - \pi/2)^5}{5!} + \cdots. \tag{9-9}$$
To determine if the series converges to the function on the left, we consider the
remainder after $n$ terms. Now, (9-8) gives $f^{(n)}(x) = \pm\sin x$ or $\pm\cos x$, so that (9-3)
implies $|R_n| \le |x - \pi/2|^n/n!$. Since $\lim R_n = 0$, the expansion (9-9) is valid.
Upon setting $t = \pi/2 - x$ and noting that $\cos(\pi/2 - t) = \sin t$, we get an expansion
for $\sin t$,
$$\sin t = t - \frac{t^3}{3!} + \frac{t^5}{5!} - \frac{t^7}{7!} + \cdots + \frac{(-1)^n t^{2n+1}}{(2n+1)!} + \cdots, \tag{9-10}$$
which is consistent with (9-5). It is left as an exercise for the reader to obtain a similar
expansion for the cosine by use of (9-5) and (9-3):
$$\cos t = 1 - \frac{t^2}{2!} + \frac{t^4}{4!} - \cdots + \frac{(-1)^n t^{2n}}{(2n)!} + \cdots. \tag{9-11}$$
In these examples the fact that the series converges to the function was
established by direct examination of $R_n$. Such examination is necessary
even when the series is found to be convergent by other means. For example,
if we define
$$f(x) = e^{-1/x^2}, \qquad f(0) = 0,$$
it can be shown that the Taylor series about $x = 0$ converges for all $x$ but
converges to $f(x)$ only when $x = 0$. The trouble with this function is that
it does not admit any power-series expansion valid over an interval containing
$x = 0$, and we have the following:
1 The fact that (9-6) converges shows that $x^n/n! \to 0$. (Cf. Prob. 5, Sec. 5.)
Theorem I. Suppose a function $f(x)$ admits a series representation in
powers of $x - a$, so that $f(x) = \sum a_n(x-a)^n$ for some interval $|x - a| < c$.
Then the Taylor series generated by $f(x)$ coincides with the given expansion
[and hence, the Taylor series converges to $f(x)$].
For proof, differentiate$^1$ $n$ times and set $x = a$, just as in the discussion
of (8-7). It will be found that $a_n = f^{(n)}(a)/n!$, and hence the given series
is identical with the Taylor series.
Theorem I shows that a valid power-series expansion obtained by any
method whatever must coincide with the Taylor series. For instance, to
find the Taylor series for $\sin x^2$ about $x = 0$, we set $t = x^2$ in (9-10). This
is far simpler than direct use of Taylor's formula, as the reader can verify.
Example: Obtain the expansion of
$$f(x) = \frac{1}{(x-2)(x-3)}$$
in powers of $x - 1$.
With $t = x - 1$, the given function becomes
$$\frac{1}{(t-1)(t-2)} = \frac{-1}{t-1} + \frac{1}{t-2} = \frac{1}{1-t} - \frac{1}{2}\,\frac{1}{1 - \frac{1}{2}t} = \sum_{n=0}^{\infty} t^n - \frac{1}{2}\sum_{n=0}^{\infty}\left(\frac{t}{2}\right)^n \tag{9-12}$$
when we use partial fractions and the known formula for the sum of a geometric series.
Upon recalling that $t = x - 1$, we get the required result
$$\frac{1}{(x-2)(x-3)} = \sum_{n=0}^{\infty}\left(1 - \frac{1}{2^{n+1}}\right)(x-1)^n. \tag{9-13}$$
Since the two geometric series (9-12) converge for $|t| < 1$ and $|t| < 2$, respectively,
the expansion (9-13) is valid for $|x - 1| < 1$. By Theorem I, this expansion coincides
with the Taylor series.
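The expansion in powers of $x - 1$ can be checked numerically. The following is only a sketch: it assumes the coefficient of $(x-1)^n$ is $1 - 2^{-(n+1)}$, which is what the two geometric series in (9-12) give when combined, and the function names are illustrative rather than taken from the text.

```python
from fractions import Fraction

def coeff(n):
    """Assumed coefficient of (x - 1)**n: 1 - 1/2**(n + 1), from the two geometric series."""
    return 1 - Fraction(1, 2 ** (n + 1))

def partial_sum(x, terms):
    """Sum of the first `terms` terms of the expansion in powers of x - 1."""
    t = x - 1.0
    return sum(float(coeff(n)) * t ** n for n in range(terms))

def f(x):
    """The function being expanded, 1 / ((x - 2)(x - 3))."""
    return 1.0 / ((x - 2.0) * (x - 3.0))

# Inside |x - 1| < 1 the partial sums should approach f(x).
for x in (0.5, 1.0, 1.4):
    print(x, f(x), partial_sum(x, 80))
```

Outside the interval $|x - 1| < 1$ the same partial sums grow without bound, which is one way to see that the restriction is needed.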
PROBLEMS
1. For the following functions find the Taylor series about the point $x = 0$ and also
about the point $x = 1$:
$$e^{2x}, \quad \sin \pi x, \quad \cos(x-1), \quad 2 + x^2, \quad (x+2)^{-1}.$$
2. (a) Expand $e^x$ about the point $x = a$ by writing $e^x = e^a e^{x-a}$ and using (9-6).
(b) Expand $\log x$ about $x = 1$ by writing $\log x = \log[1 - (1 - x)]$ and using (8-6).
(c) Obtain the general Taylor series from Maclaurin's. Hint: If $g(t) = f(a + t)$, then
$$f(a+t) = g(t) = \sum g^{(n)}(0)t^n/n! = \sum f^{(n)}(a)t^n/n!.$$
Now let $t = x - a$.
1 The fact that the series now considered are in powers of $x - a$ rather than $x$ causes
no trouble. By a simple translation of axes, $\xi = x - a$, these series become power series
of the type considered in the preceding section, hence are subject to the theorems of the
preceding section.
3. Expand the following fractions about the point $x = 1$:
$$\frac{1}{x}, \qquad \frac{1}{x^2 - 4}, \qquad \frac{1}{x(x^2 - 4)}.$$
4. Show that the Taylor series for $\sin x$ in powers of $x - a$ converges to $\sin x$ for every
value of $x$ and $a$, and find the expansion of $\sin x$ in powers of $x - \pi/6$.
5. Obtain the Maclaurin series for $\cos x$ by differentiating the series for $\sin x$.
6. By means of the known series for e u , sin u, and log (1 — u), find Taylor's expansions
for
e~ x \ sin x 2 , e x -f e~* x , e x — e“* x~ 2 log (1 -j- x 4 ).
7. What is the Taylor series for $(1 + x)^p$ if $p$ is constant? Find the interval of convergence,
and discuss the absolute convergence at the end points of the interval. Hint:
Use Theorem IIb, Sec. 5. Analysis of the remainder $R_n$ is difficult and may be omitted.
A proof that the series converges to $(1 + x)^p$ will be found in Sec. 12.
10. The Expression of Integrals as Infinite Series. Many difficult integrals
can be represented as power series. For example, if we let $x = t^2$
in the series (9-11) for $\cos x$, we get
$$\cos t^2 = 1 - \frac{t^4}{2!} + \frac{t^8}{4!} - \cdots + \frac{(-1)^n t^{4n}}{(2n)!} + \cdots$$
and hence, integrating term by term,
$$\int_0^x \cos t^2\,dt = x - \frac{x^5}{5\cdot 2!} + \frac{x^9}{9\cdot 4!} - \cdots + \frac{(-1)^n x^{4n+1}}{(4n+1)(2n)!} + \cdots. \tag{10-1}$$
This integral is called the Fresnel cosine integral; it is important in the theory
of diffraction. Although the Fresnel integral is not expressible in terms of
elementary functions in closed form, the expansion (10-1) is valid for all $x$
and gives a representation which is entirely adequate for many purposes.
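As a rough numerical illustration of (10-1), the sketch below sums the series and compares it with a composite Simpson estimate of the same integral. The function names, the number of terms, and the choice of Simpson's rule as the cross-check are assumptions made for this example only.

```python
import math

def fresnel_cos_series(x, terms=20):
    """Partial sum of (10-1): sum of (-1)^n x^(4n+1) / ((4n+1) (2n)!)."""
    return sum((-1) ** n * x ** (4 * n + 1) / ((4 * n + 1) * math.factorial(2 * n))
               for n in range(terms))

def fresnel_cos_simpson(x, panels=2000):
    """Composite Simpson estimate of the integral of cos(t^2) from 0 to x (panels must be even)."""
    h = x / panels
    total = math.cos(0.0) + math.cos(x * x)
    for i in range(1, panels):
        t = i * h
        total += (4 if i % 2 else 2) * math.cos(t * t)
    return total * h / 3

print(fresnel_cos_series(1.0), fresnel_cos_simpson(1.0))
```

For moderate $x$ a handful of terms already suffices, since the factorial in the denominator grows much faster than $x^{4n+1}$.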
Sometimes one may obtain a power series involving a parameter rather
than the variable of integration as in the last example. To illustrate this
possibility we shall express the arc length of an ellipse as a power series in
the eccentricity $k$. If the equation of the ellipse is given in parametric form
as
$$x = a\sin\theta, \qquad y = b\cos\theta, \qquad a > b,$$
then the arc $s$ satisfies
$$ds^2 = dx^2 + dy^2 = (a^2\cos^2\theta + b^2\sin^2\theta)\,d\theta^2.$$
Upon noting that $\cos^2\theta = 1 - \sin^2\theta$, we obtain
$$ds = a\sqrt{1 - k^2\sin^2\theta}\,d\theta,$$
where $k = (a^2 - b^2)^{1/2}/a$ is the eccentricity. Hence, the arc from $\theta = 0$ to
$\theta = \phi$ is
$$s = a\int_0^{\phi}\sqrt{1 - k^2\sin^2\theta}\,d\theta \equiv aE(k,\phi).$$
The integral $E(k,\phi)$ defined by this equation is called the elliptic integral of
the second kind. Although $E(k,\phi)$ is not elementary, it may be expressed as
a power series.
By the binomial theorem (Sec. 12)
$$(1 - k^2\sin^2\theta)^{1/2} = 1 - \tfrac{1}{2}k^2\sin^2\theta - \tfrac{1}{8}k^4\sin^4\theta - \cdots \tag{10-2}$$
for $k^2 < 1$, which is the case when $b \ne 0$. Since
$$1 + \tfrac{1}{2}k^2 + \tfrac{1}{8}k^4 + \cdots$$
serves as an $M$ series, (10-2) is uniformly convergent and term-by-term
integration is permissible. Hence we obtain the desired expression
$$E(k,\phi) = \phi - \frac{1}{2}k^2\int_0^{\phi}\sin^2\theta\,d\theta - \frac{1}{2\cdot 4}k^4\int_0^{\phi}\sin^4\theta\,d\theta - \cdots - \frac{1\cdot 3\cdot 5\cdots(2n-3)}{2\cdot 4\cdot 6\cdots 2n}k^{2n}\int_0^{\phi}\sin^{2n}\theta\,d\theta - \cdots.$$
In a similar manner it can be shown that the elliptic integral of the first kind (cf.
Example 3, Sec. 20, Chap. 1),
$$F(k,\phi) = \int_0^{\phi}\frac{d\theta}{\sqrt{1 - k^2\sin^2\theta}},$$
has the expansion
$$F(k,\phi) = \phi + \frac{1}{2}k^2\int_0^{\phi}\sin^2\theta\,d\theta + \frac{1\cdot 3}{2\cdot 4}k^4\int_0^{\phi}\sin^4\theta\,d\theta + \cdots + \frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots 2n}k^{2n}\int_0^{\phi}\sin^{2n}\theta\,d\theta + \cdots.$$
The elliptic integral of the third kind is
$$\Pi(n,k,\phi) = \int_0^{\phi}\frac{d\theta}{(1 + n\sin^2\theta)\sqrt{1 - k^2\sin^2\theta}}$$
and this, too, can be expressed as a series by expanding the radical.
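The series for the elliptic integral of the second kind lends itself to a short computation. The sketch below assumes the binomial coefficients $c_n = 1\cdot 3\cdots(2n-3)/(2\cdot 4\cdots 2n)$ with $c_1 = \tfrac12$, and evaluates $\int_0^{\phi}\sin^{2n}\theta\,d\theta$ by the standard reduction formula; the function names and the Simpson cross-check are illustrative choices, not part of the text.

```python
import math

def elliptic_e_series(k, phi, terms=60):
    """E(k, phi) = phi - sum_{n>=1} c_n k^(2n) I_n, with I_n the integral of sin^(2n) from 0 to phi."""
    total = phi
    c = 0.5            # c_1 = 1/2
    integral = phi     # I_0 = phi
    s, co = math.sin(phi), math.cos(phi)
    for n in range(1, terms + 1):
        # Reduction formula: I_n = (-(sin phi)^(2n-1) cos phi + (2n - 1) I_{n-1}) / (2n)
        integral = (-(s ** (2 * n - 1)) * co + (2 * n - 1) * integral) / (2 * n)
        total -= c * k ** (2 * n) * integral
        c *= (2 * n - 1) / (2 * n + 2)   # c_{n+1} = c_n (2n - 1) / (2n + 2)
    return total

def elliptic_e_simpson(k, phi, panels=2000):
    """Composite Simpson estimate of the defining integral, used only as a cross-check."""
    h = phi / panels
    g = lambda t: math.sqrt(1.0 - (k * math.sin(t)) ** 2)
    total = g(0.0) + g(phi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

print(elliptic_e_series(0.5, 1.0), elliptic_e_simpson(0.5, 1.0))
```

The number of terms needed grows as $k$ approaches 1, because the terms then decrease only slowly.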
Any integral of the form
J (a sin x + b cos x + c)^ dx
or of the form
$$\int R\left(x, \sqrt{ax^4 + bx^3 + cx^2 + dx + e}\right)dx, \qquad R(x,y) = \text{rational function},$$
is expressible in terms of the elliptic integrals 1 together with elementary
functions. For this reason elliptic integrals have great practical impor-
tance and have been extensively tabulated.
1 See, for example, P. Franklin, "Methods of Advanced Calculus," chap. 7, McGraw-Hill
Book Company, Inc., New York, 1944.
PROBLEMS
- dt.
1. Expand the following integrals as power series:
dt, f sin (t 2 ) dt, — t
Jo Jo t Jo Jo t
2. Express $\int_0^{\pi/2} e^{x\sin t}\,dt$ as a power series in $x$. Hint: By Wallis' formula,
$$\int_0^{\pi/2}\sin^n t\,dt = \frac{(n-1)(n-3)\cdots 2\ \text{or}\ 1}{n(n-2)\cdots 2\ \text{or}\ 1}\,\frac{\alpha}{2},$$
where $\alpha = 2$ if $n$ is odd and $\alpha = \pi$ if $n$ is even.
3. Express the incomplete gamma function
$$\int_0^x t^{p-1}e^{-t}\,dt$$
as a series in powers of $x$. For what values of $x$ and $p$ is your expansion valid?
4. The beta function is defined by
$$B(p,q) = \int_0^1 x^{p-1}(1-x)^{q-1}\,dx.$$
Express this as a series by using the binomial theorem for $(1 - x)^{q-1}$ and integrating
term by term. For what values of $p$ and $q$ is the resulting series absolutely convergent?
(See Theorem IIb, Sec. 5. Although the range of integration includes the value $x = 1$,
which is an end point of the convergence interval, the integration is easily justified. Thus,
one might consider $\int_0^x$ and let $x \to 1$ through values less than 1. The desired result
then follows from Abel's theorem, Sec. 8.)
11. Approximation by Means of Taylor's Formula. If a function $f(x)$
has a convergent Taylor series, then the partial sums of that series can be
used to approximate the function. In this way, calculations of great intrinsic
complexity are reduced to calculations involving polynomials. The
method is especially important because Taylor's formula not only gives a
polynomial approximation but gives a means of estimating the error.
Thus, the remainder $R_n$ in (9-2) and (9-3) is precisely the difference between
$f(x)$ and the $n$th partial sum of its Taylor series.
To illustrate the use of Taylor's series for numerical computation, let
us find $\sin 10^\circ$ within $\pm 10^{-7}$. The value $10^\circ = \pi/18$ radian is closer to
zero than to any other value of $x$ for which $\sin x$ and its derivatives are
easily found, and hence the expansion is taken about the point $x = 0$. To
estimate the number of terms required, (9-3) gives
$$|R_n| = \left|\frac{f^{(n)}(\xi)}{n!}x^n\right| \le \frac{x^n}{n!} \le \frac{(0.175)^n}{n!} \tag{11-1}$$
when $a = 0$, and when we set $x = \pi/18 < 0.175$ and recall that $\sin x$ together
with its derivatives is less than 1 in magnitude. The successive
bounds for $R_n$ as given by (11-1) may be computed recursively; indeed, the
$n$th value is obtained by applying a factor $(0.175/n)$ to the preceding one.
For $n = 1$ the bound (11-1) is 0.175 and the next few are as follows:

Value of n          2             3             4             5             6
Bound for |R_n|     1.5 × 10^-2   8.9 × 10^-4   3.9 × 10^-5   1.4 × 10^-6   4.0 × 10^-8
From a list such as this the $n$ sufficient for any prescribed accuracy can be
determined at once. In particular, an accuracy of $\pm 10^{-7}$ is found if we
take $n = 6$. Thus
$$\sin 10^\circ = \frac{\pi}{18} - \left(\frac{\pi}{18}\right)^3\frac{1}{3!} + \left(\frac{\pi}{18}\right)^5\frac{1}{5!} + R_6,$$
where $|R_6| \le 4.0 \times 10^{-8}$; more explicitly, $0 > R_6 > -4.0 \times 10^{-8}$, since
$f^{(6)}(\xi) = -\sin\xi < 0$. Inasmuch as the next term of the series is zero,
the first six terms are the same as the first seven terms. Hence the error is
also equal to $R_7$, where
$$0 > R_7 > -1.0 \times 10^{-9}. \tag{11-2}$$
An improvement of accuracy such as this is to be expected whenever the
series is terminated just before one or more terms with zero coefficients.
In modern computing practice an automatic computing machine is so programmed
that it keeps track of the remainder, which can often be estimated recursively as in
this example. The machine is then instructed to take as many terms as are needed to
make the remainder less than some preassigned amount. This process was illustrated
in the foregoing calculation, where the value $n = 6$ was chosen, not at random, but by
consideration of the desired accuracy.
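The procedure just described can be sketched in a few lines. The code below is an illustration only: it assumes the bound $x^n/n!$ of (11-1) with $x$ replaced by 0.175, updates that bound by the factor $0.175/n$ at each step, and stops as soon as the bound falls below the requested tolerance. The function names are hypothetical.

```python
import math

def terms_needed(x_bound, tol):
    """Smallest n with x_bound**n / n! <= tol, using the recursion b_n = b_{n-1} * x_bound / n."""
    n, bound = 1, x_bound
    while bound > tol:
        n += 1
        bound *= x_bound / n
    return n, bound

def sin_taylor(x, n):
    """Sum of the Maclaurin terms of sin x of degree less than n."""
    total, term, k = 0.0, x, 1
    while k < n:
        total += term
        term *= -x * x / ((k + 1) * (k + 2))   # next odd-degree term
        k += 2
    return total

n, bound = terms_needed(0.175, 1e-7)
approx = sin_taylor(math.pi / 18, n)
print(n, bound, approx, math.sin(math.pi / 18))
```

With the tolerance $10^{-7}$ the loop stops at $n = 6$, in agreement with the table, and the polynomial it selects is $x - x^3/3! + x^5/5!$.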
The reader may have noticed that the series for $\sin(\pi/18)$ is an alternating series with
terms decreasing in magnitude. Hence, the estimate for the error (11-2) could have
been found by Theorem II of Sec. 6. Taylor's formula, however, has the merit of applying
to general power series, whether alternating or not.
Many important approximations are obtained by using the first few
terms of the Taylor-series expansion instead of the function itself. For example,
the formula
$$k = y''[1 + (y')^2]^{-3/2}$$
for curvature of the curve whose equation is $y = f(x)$ yields
$$k = y''\left[1 - \frac{3}{2}(y')^2 + \frac{1}{2!}\,\frac{3}{2}\,\frac{5}{2}(y')^4 - \cdots\right]$$
when we use the binomial theorem. The first-term approximation $k \approx y''$ is
sufficient for most applications.
As another example, in railroad surveying it is frequently useful to know
the difference between the length of a circular arc and the length of the
corresponding chord. Let $r$ be the radius of curvature of the arc $AB$
(Fig. 6), and let $\alpha$ be the angle intercepted by the arc. Then, if $s$ is the
length of the arc $AB$ and $c$ is the length of the chord $AB$, $s = r\alpha$ and
$c = 2r\sin(\alpha/2)$. Since
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!}\cos\xi,$$
where $0 < \xi < x$, the error in using only the first two terms of the expansion
is certainly less than $x^5/5!$. Then,
$$c = 2r\sin\frac{\alpha}{2} = 2r\left(\frac{\alpha}{2} - \frac{\alpha^3}{8\cdot 3!}\right)$$
with an error less than
$$2r\left(\frac{\alpha^5}{32\cdot 120}\right) = \frac{r\alpha^5}{1{,}920}.$$
Therefore, $s - c = \alpha^3 r/24$ with an error that is less than $r\alpha^5/1{,}920$.
Example 1. For the nonelementary integral $\int_0^x e^{u^2}\,du$ obtain a polynomial approximation
valid within $\pm 0.00001$ when $0 \le x \le \frac{1}{2}$.
According to (9-6) and (9-7),
$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^{n-1}}{(n-1)!} + \frac{x^n e^{\xi}}{n!}, \qquad 0 < \xi < x.$$
If we set $x = u^2$, this becomes
$$e^{u^2} = 1 + u^2 + \frac{u^4}{2!} + \cdots + \frac{u^{2n-2}}{(n-1)!} + \frac{u^{2n}e^{\xi}}{n!}, \qquad 0 < \xi < u^2,$$
and integrating from 0 to $x$ yields
$$\int_0^x e^{u^2}\,du = x + \frac{x^3}{3} + \frac{x^5}{5\cdot 2!} + \cdots + \frac{x^{2n-1}}{(2n-1)(n-1)!} + \int_0^x\frac{u^{2n}e^{\xi}}{n!}\,du.$$
To estimate the integral on the right we note that
$$e^{\xi} < e^{u^2} < e^{x^2},$$
since $\xi < u^2$ and $u < x$. Hence
$$\int_0^x u^{2n}e^{\xi}\,du < e^{x^2}\int_0^x u^{2n}\,du = e^{x^2}\frac{x^{2n+1}}{2n+1}.$$
It follows that if we write
$$\int_0^x e^{u^2}\,du \approx \sum_{k=1}^{n}\frac{x^{2k-1}}{(2k-1)(k-1)!}$$
then the error $R_{2n}$ after the term $x^{2n-1}$ satisfies
$$0 < R_{2n} < \frac{e^{x^2}x^{2n+1}}{n!(2n+1)}.$$
For an approximation valid within $\pm 0.00001$ when $0 \le x \le \frac{1}{2}$ we choose $n$ large
enough to make
$$\frac{e^{x^2}x^{2n+1}}{n!(2n+1)} < 0.00002.$$
Since $e^{x^2} < 1.3$, the above condition is satisfied when
$$n!(2n+1)2^{2n+1} > 65{,}000.$$
By trial we find that $n = 4$ suffices. This choice of $n$ yields an approximation which
is too small by 0.00002 at most. If 0.00001 is added to the approximation, we get
$$\int_0^x e^{u^2}\,du = x + \frac{x^3}{3} + \frac{x^5}{10} + \frac{x^7}{42} + 0.00001 \tag{11-3}$$
within $\pm 0.00001$ when $0 \le x \le \frac{1}{2}$.
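The claimed accuracy of the approximation (11-3) can be spot-checked numerically. The sketch below assumes the four retained terms are $x + x^3/3 + x^5/10 + x^7/42$ and that the interval is $0 \le x \le \tfrac12$; a many-term sum of the same series stands in for the exact integral. All names are illustrative.

```python
import math

def approx_11_3(x):
    """Four series terms plus the 0.00001 shift, as in (11-3)."""
    return x + x ** 3 / 3 + x ** 5 / 10 + x ** 7 / 42 + 0.00001

def reference(x, terms=40):
    """Many-term sum of x^(2k+1) / ((2k+1) k!), used as a stand-in for the exact integral."""
    return sum(x ** (2 * k + 1) / ((2 * k + 1) * math.factorial(k)) for k in range(terms))

def worst_error(samples=501):
    """Largest absolute error of approx_11_3 on an evenly spaced grid over [0, 1/2]."""
    xs = [0.5 * i / (samples - 1) for i in range(samples)]
    return max(abs(approx_11_3(x) - reference(x)) for x in xs)

print(worst_error())
```

On this grid the largest deviation occurs at $x = 0$, where the added constant 0.00001 is the whole error; this is the behavior that Prob. 8 below asks the reader to examine.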
Example 2. Obtain a polynomial approximation for $\int_0^x e^{\sin x}\,dx$ valid near $x = 0$.
Keeping terms as far as $x^3$ and no terms beyond $x^3$, we have
$$e^{\sin x} = 1 + \left(x - \frac{x^3}{3!}\right) + \frac{1}{2!}\left(x - \frac{x^3}{3!}\right)^2 + \frac{1}{3!}\left(x - \frac{x^3}{3!}\right)^3 + \cdots = 1 + x + \frac{1}{2}x^2 + 0\cdot x^3 + \cdots.$$
Hence
$$\int_0^x e^{\sin x}\,dx = x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots,$$
where the terms omitted involve $x^5$ or higher powers.
This calculation of the series for $e^{\sin x}$ illustrates a principle which is often useful.
Let $f(y) = \sum b_n y^n$ and $y = \sum a_n x^n$ be power series with nonzero radii of convergence.
If $y = 0$ when $x = 0$, then the power series for $f(y)$ as a function of $x$ also has a nonzero
radius of convergence. This series may be found by substituting the series for $y$ into the
series for $f(y)$ and collecting terms.$^1$ By uniqueness, the series so obtained is the Taylor
series.
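The substitution principle can be carried out mechanically on truncated coefficient lists. The following sketch is one possible arrangement, with hypothetical helper names `mul` and `compose`; it uses exact rational arithmetic and assumes the inner series has zero constant term, as the principle requires.

```python
from fractions import Fraction
from math import factorial

def mul(a, b, order):
    """Product of two truncated power series (coefficient lists), kept through x**order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out

def compose(outer, inner, order):
    """Coefficients of outer(inner(x)) through x**order; requires inner[0] == 0."""
    assert inner[0] == 0
    result = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order   # inner**0
    for b in outer[: order + 1]:
        result = [r + b * p for r, p in zip(result, power)]
        power = mul(power, inner, order)            # next power of the inner series
    return result

order = 4
exp_series = [Fraction(1, factorial(n)) for n in range(order + 1)]
sin_series = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 6), Fraction(0)]
print(compose(exp_series, sin_series, order))
```

Carried through $x^4$, the computation reproduces $1 + x + \tfrac12 x^2 + 0\cdot x^3$ and shows that the next coefficient is $-\tfrac18$, so that the first omitted term of the integral is of degree 5.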
PROBLEMS
1. It is desired to approximate a function $f(x)$ by a polynomial $p(x)$,
$$p(x) = a_0 + a_1 x + \cdots + a_n x^n,$$
in such a way that at the origin $p(x)$ has the same value and the same first $n$ derivatives
as $f(x)$. (a) How should the coefficients be determined? Hint: $a_0 = p(0) = f(0)$, $a_1 =
p'(0) = f'(0)$, $2a_2 = p''(0) = f''(0)$, $\ldots$. (b) If the coefficients are determined as in (a),
what relation does $p(x)$ have to the Maclaurin series for $f(x)$?
2. For the following functions obtain a polynomial approximation valid near $x = 0$
by finding the first three nonzero terms of the Maclaurin series:
tan x, e 1 * 0 * sec x t
e* sin x
1 -f e*’ e* ~ l'
1 For proof, see K. Knopp, "Theory and Application of Infinite Series," p. 180, Blackie
& Son, Ltd., Glasgow, 1928.
3. (a) By means of series compute $x - 10\sin(x/10)$ to three significant figures when
$x = 1.000$. (b) Attempt the same calculation by using a table of $\sin x$ (note that $x$ is in
radians). How many significant figures for $\sin x$ are needed?
4. If $y = 10(\tan x - x)/x^3$, (a) use series to evaluate $y$ near $x = 0$. In particular,
what is the limit of $y$ as $x \to 0$? (b) Plot $y$ versus $x$ for $0 \le x \le 0.2$. (c) Discuss the
construction of such a graph by use of a table of $\tan x$, without series.
5. By use of series compute to three places of decimals:
(a) $e^{1.1} = e\,e^{0.1} = 2.7183\,e^{0.1}$; (b) $\cos 10^\circ = \cos(\pi/18)$;
(c) $\sin 33^\circ = \sin(30^\circ + 3^\circ)$; (d) $\sqrt[5]{35} = 2(1 + \frac{3}{32})^{1/5}$.
6. Evaluate by series the first three integrals to three places of decimals and the last
to two places:
sin x dx f° A log (1 - z)
[ sin (x 2 ) dx } [
Jo Jo
j
Jo
dz ,
VT — x a Jo
J (2 — cos x)~ H dx ** J ^1+2 sin 2 dx.
7. Determine the magnitude of $\alpha$ if the error in the approximation $\sin\alpha \approx \alpha$ is not
to exceed 1 per cent. Hint: $(\alpha - \sin\alpha)/\alpha = 0.01$ and $\sin\alpha = \alpha - (\alpha^3/3!) + (\alpha^5/5!) - \cdots$.
8. Discuss the percentage error in the approximation (11-3) as $x \to 0$. How would
the percentage error behave if the term 0.00001 had not been added? (This shows that
it may be better not to alter the Taylor series even when such alteration reduces the
absolute error.)
9. As in Example 1 of the text, obtain a polynomial approximation for the Fresnel
sine integral $\int_0^x \sin t^2\,dt$ which is valid within $\pm 0.00001$ for $0 \le x \le \frac{1}{2}$.
POWER SERIES AND DIFFERENTIAL EQUATIONS
12. First-order Equations. One of the most important uses of power
series is in the solution of differential equations. For example, to solve the
equation $y' = y$ assume that
$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n + \cdots.$$
Then, according to Theorem II, Sec. 8,
$$y' = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots + (n+1)a_{n+1}x^n + \cdots.$$
Since $y' = y$, Theorem III of Sec. 8 shows that the series for $y'$ and the
series for $y$ must have the same coefficients. Thus,
$$a_1 = a_0, \quad 2a_2 = a_1, \quad 3a_3 = a_2, \quad \ldots, \quad (n+1)a_{n+1} = a_n, \quad \ldots.$$
Starting with $a_0 = c$, a constant, we solve for $a_1, a_2, \ldots$ in succession to
obtain
$$a_1 = c, \quad a_2 = \frac{c}{2}, \quad a_3 = \frac{c}{3!}, \quad \ldots, \quad a_n = \frac{c}{n!},$$
and hence
$$y = c + cx + \frac{cx^2}{2!} + \frac{cx^3}{3!} + \cdots + \frac{cx^n}{n!} + \cdots.$$
This discussion is tentative only, since at first we had no assurance that
the equation $y' = y$ possesses a solution expressible as a power series.
However, the ratio test shows that the series obtained converges for all $x$.
Hence by Theorem II of Sec. 8 the term-by-term differentiation is justified
and the equation $y' = y$ has actually been solved.
As another illustration, we obtain a power-series solution for the differential
equation
$$(1 + x)y' = py, \qquad p = \text{const}, \tag{12-1}$$
such that $y = 1$ when $x = 0$. If $y = \sum a_n x^n$, then$^1$
$$\begin{aligned} y' &= a_1 + 2a_2 x + 3a_3 x^2 + \cdots + (n+1)a_{n+1}x^n + \cdots \\ xy' &= a_1 x + 2a_2 x^2 + \cdots + na_n x^n + \cdots \\ py &= pa_0 + pa_1 x + pa_2 x^2 + \cdots + pa_n x^n + \cdots \end{aligned}$$
and the substitution in (12-1) yields
$$a_1 + (2a_2 + a_1)x + (3a_3 + 2a_2)x^2 + \cdots + [(n+1)a_{n+1} + na_n]x^n + \cdots = pa_0 + pa_1 x + pa_2 x^2 + \cdots + pa_n x^n + \cdots.$$
Equating coefficients of like powers of $x$ gives the set of equations
$$\begin{aligned} a_1 &= pa_0 \\ 2a_2 + a_1 &= pa_1 \\ 3a_3 + 2a_2 &= pa_2 \\ &\;\;\vdots \\ (n+1)a_{n+1} + na_n &= pa_n \end{aligned}$$
which must be solved for the $a$'s.
Since $y = 1$ at $x = 0$, we must have $a_0 = 1$. Then we get, in succession,
$$a_1 = p, \quad a_2 = \frac{p(p-1)}{2!}, \quad a_3 = \frac{p(p-1)(p-2)}{3!}, \quad \ldots$$
so that the solution is
$$y = 1 + px + \frac{p(p-1)}{2!}x^2 + \cdots + \frac{p(p-1)(p-2)\cdots(p-n+1)}{n!}x^n + \cdots. \tag{12-2}$$
1 Throughout Secs. 12 to 14 $\sum$ means $\sum_{n=0}^{\infty}$. A brief review of this sigma notation is given
in Sec. 2.
Hence, if the differential equation (12-1) has a series solution, that solution
must be (12-2). However, the ratio test shows that the series (12-2)
converges for $|x| < 1$. Hence for $|x| < 1$ the term-by-term differentiation
was justified, and this shows without further discussion that (12-2) is
a solution for $|x| < 1$.
Equation (12-1) can be solved by elementary methods as follows. Separating
variables, we get
$$\frac{dy}{y} = p\,\frac{dx}{1+x},$$
so that $\log y = p\log(1 + x) = \log(1 + x)^p$ when we recall that $y = 1$ at
$x = 0$. Hence
$$y = (1 + x)^p.$$
Comparing this solution with that found formerly$^1$ gives
$$(1 + x)^p = 1 + \sum_{n=1}^{\infty}\frac{p(p-1)\cdots(p-n+1)}{n!}x^n,$$
which the reader will recognize as the binomial theorem. Since no assumption
was made about $p$, the result is valid for all $p$ provided $|x| < 1$.
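The coefficient equations for (12-1) amount to the recursion $(n+1)a_{n+1} = (p - n)a_n$ with $a_0 = 1$, and this is easy to run for any constant $p$, integer or not. The sketch below is illustrative only; the function names and the number of terms are assumptions.

```python
def binomial_coefficients(p, count):
    """a_0 .. a_{count-1} from (n + 1) a_{n+1} + n a_n = p a_n with a_0 = 1."""
    coeffs = [1.0]
    for n in range(count - 1):
        coeffs.append(coeffs[-1] * (p - n) / (n + 1))
    return coeffs

def binomial_series(p, x, count=200):
    """Partial sum of the series (12-2); meaningful for |x| < 1."""
    return sum(a * x ** n for n, a in enumerate(binomial_coefficients(p, count)))

print(binomial_series(0.5, 0.3), 1.3 ** 0.5)
print(binomial_coefficients(3, 6))
```

When $p$ is a nonnegative integer the recursion produces zeros from $a_{p+1}$ onward, so the series reduces to the familiar finite binomial expansion.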
PROBLEMS
1. Obtain power-series solutions of the following differential equations, which satisfy
$y = 1$ at $x = 0$:
$$y' = 2y, \qquad y' = y + x, \qquad y' + y = 1.$$
2. Obtain the first three terms of a series solution y * 2a n x" for the problem y ' * 2 esy,
y 1 when x •» 0. From these three terms compute the curvature k =» y"\\ -J- (y') 2
at x =» 0, Is your value for curvature exact or only approximate?
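3. By considering the equation $y' = (1 - x^2)^{-1/2}$ obtain a series expansion for $\sin^{-1} x$.
In particular, show that
$$\frac{\pi}{6} = \frac{1}{2} + \frac{1}{2\cdot 3\cdot 2^3} + \frac{1\cdot 3}{2\cdot 4\cdot 5\cdot 2^5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6\cdot 7\cdot 2^7} + \cdots.$$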
13. Second-order Equations. Legendre Functions. To illustrate the
use of series for solving second-order differential equations consider the
equation
$$y'' - xy' + y = 0. \tag{13-1}$$
If $y = \sum a_n x^n$, then $y' = \sum na_n x^{n-1}$ and $y'' = \sum n(n-1)a_n x^{n-2}$ and, hence,
$$\begin{aligned} y'' &= 2a_2 + 3\cdot 2a_3 x + \cdots + (n+2)(n+1)a_{n+2}x^n + \cdots \\ -xy' &= -a_1 x - \cdots - na_n x^n - \cdots \\ y &= a_0 + a_1 x + \cdots + a_n x^n + \cdots. \end{aligned}$$
1 By Chap. 1, Sec. 17, the problem has only one solution.
According to (13-1) the sum of these three power series is zero, and therefore,
by Theorem III of Sec. 8, the coefficient of $x^n$ in this sum is zero for
each $n$:
$$\begin{aligned} 2a_2 + a_0 &= 0, && \text{coefficient of } x^0,\\ 3\cdot 2a_3 - a_1 + a_1 &= 0, && \text{coefficient of } x^1,\\ &\;\;\vdots \\ (n+2)(n+1)a_{n+2} - na_n + a_n &= 0, && \text{coefficient of } x^n. \end{aligned}$$
Hence,
$$a_{n+2} = \frac{n-1}{(n+1)(n+2)}\,a_n. \tag{13-2}$$
This recursion formula gives, in succession,
$$\begin{aligned} a_2 &= \frac{-1}{1\cdot 2}\,a_0 = -\frac{1}{2!}\,a_0,\\ a_4 &= \frac{1}{3\cdot 4}\,a_2 = -\frac{1}{4!}\,a_0,\\ a_6 &= \frac{3}{5\cdot 6}\,a_4 = -\frac{3}{6!}\,a_0,\\ &\;\;\vdots \\ a_{2n} &= -\frac{1\cdot 3\cdot 5\cdots(2n-3)}{(2n)!}\,a_0, \qquad n \ge 3. \end{aligned}$$
Similarly, $a_3 = 0\cdot a_1 = 0$, $a_5 = 0$, $a_7 = 0$, and so on. Hence the solution is
$$y = a_0\left(1 - \frac{1}{2!}x^2 - \frac{1}{4!}x^4 - \frac{1\cdot 3}{6!}x^6 - \frac{1\cdot 3\cdot 5}{8!}x^8 - \cdots\right) + a_1 x$$
with the two arbitrary constants $a_0$ and $a_1$. There should be two constants
in the general solution because the equation is of the second order (Chap. 1,
Sec. 21).
The ratio of two successive nonzero terms of the infinite series satisfies
$$\left|\frac{a_{2n+2}x^{2n+2}}{a_{2n}x^{2n}}\right| = \frac{2n-1}{(2n+1)(2n+2)}\,|x|^2 \tag{13-3}$$
as is seen by using (13-2) with $2n$ written in place of $n$. Since the limit of (13-3) is zero,
the series converges for all $x$. Hence the term-by-term differentiation is justified, and
we really do get a solution.
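The recursion (13-2) can be run exactly with rational arithmetic, which gives a quick check on the coefficients listed above. The sketch below uses hypothetical function names; the second function simply recomputes the coefficient of $x^n$ in $y'' - xy' + y$ from a list of coefficients.

```python
from fractions import Fraction

def series_coefficients(a0, a1, count):
    """a_0 .. a_{count-1} from the recursion (13-2): a_{n+2} = (n - 1) a_n / ((n + 1)(n + 2))."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(count - 2):
        a.append(a[n] * Fraction(n - 1, (n + 1) * (n + 2)))
    return a

def residual_coefficients(a):
    """Coefficient of x^n in y'' - x y' + y, for each n where a[n + 2] is available."""
    return [(n + 2) * (n + 1) * a[n + 2] - n * a[n] + a[n] for n in range(len(a) - 2)]

even = series_coefficients(1, 0, 10)   # a_0 = 1, a_1 = 0
odd = series_coefficients(0, 1, 10)    # a_0 = 0, a_1 = 1
print(even)
print(odd)
```

With $a_0 = 0$, $a_1 = 1$ every coefficient after $a_1$ vanishes, so the odd solution is the polynomial $y = x$; with $a_0 = 1$, $a_1 = 0$ the list reproduces $-1/2!$, $-1/4!$, $-3/6!$, $-15/8!$.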
Upon choosing $a_0 = 0$, $a_1 = 1$ we see that a particular solution is
$$y_1 = x \tag{13-4}$$
and the choice $a_0 = 1$, $a_1 = 0$ shows that another particular solution is
$$y_2 = 1 - \frac{1}{2!}x^2 - \frac{1}{4!}x^4 - \frac{1\cdot 3}{6!}x^6 - \frac{1\cdot 3\cdot 5}{8!}x^8 - \cdots. \tag{13-5}$$
According to Chap. 1, Sec. 21, two solutions $y_1$ and $y_2$ such as this are
linearly independent if the equation $c_1 y_1 + c_2 y_2 \equiv 0$ can hold for constants
$c_1$ and $c_2$ only when $c_1 = c_2 = 0$. The solutions (13-4) and (13-5) are independent,
since one is an odd function and the other is even.$^1$ Hence the
expression $a_1 y_1 + a_0 y_2$ is the general solution of (13-1) (Chap. 1, Sec. 21).
The independence of $y_1$ and $y_2$ can also be deduced from the following
theorem:
Theorem I. Let $y_1 = \sum a_n x^n$ and $y_2 = \sum b_n x^n$ be power series with nonzero
radii of convergence, and suppose that $y_1 \not\equiv 0$. If $y_1$ and $y_2$ are linearly
dependent, then there is a constant $c$ such that $b_n = ca_n$ for all values of $n$.
From $c_1\sum a_n x^n + c_2\sum b_n x^n \equiv 0$, it follows that
$$c_1 a_n + c_2 b_n = 0, \qquad n = 0, 1, 2, \ldots. \tag{13-6}$$
If $c_2 = 0$, then $c_1 \ne 0$ (the constants are not both zero) and (13-6) gives $a_n = 0$ for all $n$,
contrary to hypothesis. Hence $c_2 \ne 0$, and (13-6) gives $b_n = (-c_1/c_2)a_n$. This is the required result.
Obviously not every differential equation has solutions that can be
represented by the power series.$^2$ The following theorem due to Fuchs,
which we state without proof,$^3$ gives sufficient conditions for the existence
of power-series solutions of second-order linear equations.
Theorem II. Let $y'' + f_1(x)y' + f_2(x)y = 0$ have coefficients $f_1(x)$ and
$f_2(x)$ which can be expanded in convergent power series for $|x| < r$. Then
every solution $y$ can be expanded in a convergent power series for $|x| < r$.
The solution series converges at least for $|x| < r$ but may converge in a larger interval.
For example, consider the equation
$$(2 - x)y'' + (x - 1)y' - y = 0. \tag{13-7}$$
Writing (13-7) in the form
$$y'' + \frac{x-1}{2-x}\,y' - \frac{1}{2-x}\,y = 0$$
(so that the coefficient of $y''$ is 1 as in Theorem II) we get
$$f_1(x) = \frac{x-1}{2-x} = \frac{x-1}{2}\,\frac{1}{1 - \frac{1}{2}x}, \qquad f_2(x) = -\frac{1}{2-x} = -\frac{1}{2}\,\frac{1}{1 - \frac{1}{2}x}.$$
1 A function $f(x)$ is odd if $f(-x) = -f(x)$, even if $f(-x) = f(x)$.
2 Thus, $xy' = 1$ has a solution $y = \log x$ which cannot be expanded in Maclaurin's
series.
3 A simple proof is given in H. T. H. Piaggio, "An Elementary Treatise on Differential
Equations and Their Applications," 2d ed., George Bell & Sons, Ltd., London, 1928.
Since $f_1(x)$ and $f_2(x)$ have power-series expansions for $|x| < 2$, Theorem II asserts that
the solution $y$ also has a power-series expansion valid for $|x| < 2$. Actually, the solution
of (13-7) is
$$y = c_1 e^x + c_2(1 - x) = c_1\sum\frac{x^n}{n!} + c_2(1 - x),$$
which converges for all $x$.
As another example we shall find the complete solution of Legendre's
equation
$$(1 - x^2)y'' - 2xy' + p(p+1)y = 0, \tag{13-8}$$
where $p$ is a constant.
By Theorem II this equation has a power-series solution valid at least
for $|x| < 1$. Assuming $y = \sum a_n x^n$, we get
$$\begin{aligned} y'' &= \sum a_{n+2}(n+2)(n+1)x^n \\ -x^2y'' &= \sum -a_n n(n-1)x^n \\ -2xy' &= \sum -2na_n x^n \\ p(p+1)y &= \sum a_n p(p+1)x^n. \end{aligned}$$
By (13-8) the sum of these series is zero. Considering the coefficient of $x^n$
yields
$$a_{n+2}(n+2)(n+1) = a_n[n(n+1) - p(p+1)] \tag{13-9}$$
after slight simplification. For all $n \ge 0$ we have
$$a_{n+2} = -a_n\,\frac{(p-n)(p+n+1)}{(n+1)(n+2)}, \tag{13-10}$$
after factoring the bracket in (13-9) and dividing by $(n+2)(n+1)$. The
coefficients for even $n$ are determined from $a_0$, and those for odd $n$ from $a_1$.
Computing the coefficients successively, we get the final result
$$y = a_0\left[1 - \frac{p(p+1)}{2!}x^2 + \frac{p(p-2)(p+1)(p+3)}{4!}x^4 - \cdots\right] + a_1\left[x - \frac{(p-1)(p+2)}{3!}x^3 + \frac{(p-1)(p-3)(p+2)(p+4)}{5!}x^5 - \cdots\right].$$
Theorem II guarantees that the general solution can be obtained in this fashion, and
indeed, by Theorem I, the coefficients of $a_0$ and $a_1$ in the foregoing expressions are
independent. Equation (13-9) shows that the series converge for $|x| < 1$ when we apply
the ratio test to the ratio of successive nonzero terms. When $p$ is a positive even integer,
however, the expression involving $a_0$ reduces to a polynomial, hence converges for all $x$.
If $a_0$ is so chosen that the polynomial has the value unity when $x = 1$, we get the sequence
$$P_0(x) = 1, \qquad P_2(x) = \frac{3}{2}x^2 - \frac{1}{2}, \qquad P_4(x) = \frac{7\cdot 5}{4\cdot 2}x^4 - 2\,\frac{5\cdot 3}{4\cdot 2}x^2 + \frac{3\cdot 1}{4\cdot 2},$$
and so on. Similarly, when $p$ is a positive odd integer, the coefficient of $a_1$ terminates
and we get a sequence of which the first three terms are
$$P_1(x) = x, \qquad P_3(x) = \frac{5}{2}x^3 - \frac{3}{2}x, \qquad P_5(x) = \frac{9\cdot 7}{4\cdot 2}x^5 - 2\,\frac{7\cdot 5}{4\cdot 2}x^3 + \frac{5\cdot 3}{4\cdot 2}x.$$
These polynomials are known as Legendre polynomials; they arise in several branches of
applied mathematics.
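The recursion (13-10) together with the normalization $P_p(1) = 1$ is enough to generate these polynomials mechanically. The following sketch, with a hypothetical function name, does so in exact rational arithmetic for a nonnegative integer $p$.

```python
from fractions import Fraction

def legendre_coefficients(p):
    """Coefficients [c_0, ..., c_p] of P_p(x) for a nonnegative integer p, via (13-10)."""
    coeffs = [Fraction(0)] * (p + 1)
    start = p % 2                 # even p uses the a_0 series, odd p the a_1 series
    coeffs[start] = Fraction(1)
    for n in range(start, p - 1, 2):
        # a_{n+2} = -a_n (p - n)(p + n + 1) / ((n + 1)(n + 2)); the factor (p - n) ends the series at n = p
        coeffs[n + 2] = -coeffs[n] * Fraction((p - n) * (p + n + 1), (n + 1) * (n + 2))
    scale = sum(coeffs)           # value of the unnormalized polynomial at x = 1
    return [c / scale for c in coeffs]

for p in range(6):
    print(p, legendre_coefficients(p))
```

For $p = 4$ this gives $\tfrac{3}{8} - \tfrac{15}{4}x^2 + \tfrac{35}{8}x^4$, and for $p = 5$ it gives $\tfrac{15}{8}x - \tfrac{35}{4}x^3 + \tfrac{63}{8}x^5$, in agreement with the expressions for $P_4$ and $P_5$.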
PROBLEMS
1. Solve by means of power series:
v" + y “ o, y" -f y 1, ?y" - xy « 1.
2. For Eq. (13-7), show that $y = \sum a_n x^n$ leads to
$$2(n+2)(n+1)a_{n+2} - (n+1)^2a_{n+1} + (n-1)a_n = 0.$$
(a) By taking $a_0 = a_1 = 1$, find a solution satisfying $y = y' = 1$ at $x = 0$. (b) By another
choice of $a_0$ and $a_1$ find a solution satisfying $y = -1$, $y' = 1$ at $x = 0$.
3. Solve by means of power senes if y( 0) — y'{ 0):
(x* - 3x + 2)y" + (x 2 -2x- W + (x - 3 )y - 0.
4. Solve $y'' - (x - 2)y = 0$ by assuming $y = \sum a_n(x - 2)^n$. Also obtain the first
three terms of the solution in the form $y = \sum a_n x^n$.
5. It can be shown$^1$ that
$$(1 - 2xh + h^2)^{-1/2} = P_0(x) + P_1(x)h + P_2(x)h^2 + \cdots + P_n(x)h^n + \cdots.$$
V(‘rify thus equality through the terms in h b Hint Expand [1 — (2hr — by the
binomial theorem, and collect powers of h The function (1 — 2xh 4" /r)“ Va is calfed
the generating function of t lie sequence P n {x ).
6. Verify Rodrigues' formula$^2$
$$P_n(x) = \frac{1}{2^n n!}\,\frac{d^n}{dx^n}(x^2 - 1)^n$$
for $n = 0, 1, 2, 3$.
14. Generalized Power Series. Bessel’s Equation. An important dif-
ferential equation was encountered by the German astronomer and mathe-
matician F. W. Bessel in a study of planetary motion. The so-called Bessel
functions which arise from the solution of this equation are indispensable
in the study of vibration of chains, propagation of electric currents in
cylindrical conductors, heat flow in cylinders, vibration of circular mem-
branes, and many other problems of applied mathematics.
Bessel's equation is
$$x^2y'' + xy' + (x^2 - p^2)y = 0, \tag{14-1}$$
1 E. T. Whittaker and G. N. Watson, "Modern Analysis," pp. 302-303, Cambridge
University Press, London, 1952.
2 Ibid.
where $p$ is a constant. Theorem II of Sec. 13 does not apply to this equation,
since
$$f_1(x) = \frac{1}{x}, \qquad f_2(x) = 1 - \frac{p^2}{x^2}$$
and these functions cannot be expanded in power series near $x = 0$. For
this reason we do not expect a power-series solution $y = \sum a_n x^n$. It was
shown by Fuchs, however, that a wide class of equations, including (14-1),
have solutions of the form$^1$
$$y = x^{\rho}\sum a_n x^n = \sum_{n=0}^{\infty}a_n x^{n+\rho}, \qquad a_0 \ne 0, \tag{14-2}$$
where $\rho$ is constant. The theorem of Fuchs reads as follows:
Theorem I. Let xfi (x) and x 2 f 2 (x) have power-series expansions valid for
| x j < r. Then the equation
y" + fi(x)y’ + f 2 {x)y « 0
has a solution of form (14-2), also valid for \x\ < r.
Since Bessel's equation gives
xfi(x) = 1, x 2 / 2 (x) = x 2 - p 2 ,
Theorem I asserts that the series (14-2), when found, will be valid for all x.
To obtain this series, note that
X 2 L ««* n+p = i a„x n+p+2 - £ a n _ 2 x” +p , (14-3)
n = e »0 n «=0 n *»2
as is seen by writing out in full. The limits (2 ,«d) on the latter summation
may be changed to (0,«>) if we agree to define
= 0 for all negative n. (14-4)
Hence
x 2 y" = 2a n (n + p)(n + p - l)x n4 " p
£ 2 /' = Xa n (n -f p)x n+p
x 2 y = Sa n _ 2 ^ n4 * p
—p 2 2/ * S — p 2 a n x n4p .
According to Eq. (14-1), which we wish to solve, the sum of the four terms
on the left of the above equations is zero. Hence, the same is true for the
series on the right. Equating to zero the coefficient of $x^{n+\rho}$ in the sum of
these series gives
    $a_n (n+\rho)(n+\rho-1) + a_n (n+\rho) + a_{n-2} - p^2 a_n = 0$
1 The novelty is that we allow $\rho$ to be any number, whereas if (14-2) is an ordinary
power series, $\rho$ must be an integer. Since $\rho$ may be increased at will, the assumption
$a_0 \neq 0$ involves no loss of generality.
or, after simplification,
    $a_n \left[(n+\rho)^2 - p^2\right] + a_{n-2} = 0.$    (14-5)
Equation (14-5) is valid for all n. For negative n Eq. (14-5) holds
automatically by virtue of (14-4). The first nontrivial case of (14-5) is called
the indicial equation; it is obtained in the present example by putting n = 0
and takes the form
    $a_0 (\rho^2 - p^2) + 0 = 0,$    indicial equation.    (14-6)
This shows that $\rho = p$ or $\rho = -p$, since $a_0 \neq 0$. The other values of $a_n$
are determined from (14-5) in the form
    $a_n = -\dfrac{a_{n-2}}{(n+\rho)^2 - p^2}.$    (14-7)
The choice n = 1 gives¹ $a_1 = 0$, and hence $a_n = 0$ for all odd n. Also
    $a_2 = -\dfrac{a_0}{(2+\rho)^2 - p^2},$
    $a_4 = -\dfrac{a_2}{(4+\rho)^2 - p^2} = \dfrac{a_0}{[(4+\rho)^2 - p^2][(2+\rho)^2 - p^2]},$
and so on. In this way it is easily verified that the series corresponding to
$\rho = p$ is
    $y = a_0 x^p \left[ 1 - \dfrac{x^2}{2(2p+2)} + \dfrac{x^4}{2\cdot 4\,(2p+2)(2p+4)} - \cdots \right]$
and that the series for $\rho = -p$ is the same, with -p in place of p.
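The recursion (14-7) lends itself to a quick machine check. The following is a minimal sketch, assuming exact rational arithmetic and the root $\rho = p$; the function name `frobenius_coeffs` is hypothetical and is not part of the text.

```python
from fractions import Fraction

def frobenius_coeffs(p, n_max, a0=Fraction(1)):
    """Return [a_0, ..., a_{n_max}] from the recursion (14-7) with rho = p."""
    a = [Fraction(0)] * (n_max + 1)
    a[0] = Fraction(a0)
    for n in range(2, n_max + 1, 2):
        # (n + p)^2 - p^2 = n (n + 2p); the odd coefficients remain zero.
        a[n] = -a[n - 2] / ((n + p) ** 2 - p ** 2)
    return a

p = Fraction(3, 2)  # any p for which the denominators do not vanish
a = frobenius_coeffs(p, 6)
# Compare with the coefficients displayed in the series for rho = p.
print(a[2] == -1 / (2 * (2 * p + 2)))
print(a[4] == 1 / (2 * 4 * (2 * p + 2) * (2 * p + 4)))
```

Rational arithmetic is used here only so that the comparison with the closed-form coefficients is exact; floating-point numbers would serve equally well for numerical work.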
When p is a nonnegative integer, the expression may be simplified by use
of factorials, as follows. We take a factor 2 from each term of the
denominator and place it with the x in the numerator, obtaining
    $y = a_0 x^p \left[ 1 - \dfrac{(x/2)^2}{1\,(p+1)} + \dfrac{(x/2)^4}{1\cdot 2\,(p+1)(p+2)} - \cdots \right].$
If the denominators are now multiplied by p!, there results
    $y = a_0\, p!\, x^p \left[ \dfrac{1}{p!} - \dfrac{(x/2)^2}{1!\,(p+1)!} + \dfrac{(x/2)^4}{2!\,(p+2)!} - \cdots \right],$    (14-8)
and since $x^p = 2^p (x/2)^p$, this yields $y = a_0\, p!\, 2^p J_p(x)$, where
    $J_p(x) = \sum_{n=0}^{\infty} \dfrac{(-1)^n (x/2)^{2n+p}}{n!\,(p+n)!}.$    (14-9)
1 Provided the denominator in (14-7) does not vanish. See Prob. 7.
The function $J_p(x)$ is called the Bessel function of order p. The graphs of
$J_0(x)$ and $J_1(x)$ are shown in Fig. 7.
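For nonnegative integer p the series $J_p(x) = \sum_{n \ge 0} (-1)^n (x/2)^{2n+p} / [n!\,(p+n)!]$ can be summed directly. The sketch below is an illustration only: the name `bessel_j` and the choice of 40 terms are assumptions, and the truncation is adequate only for moderate |x|.

```python
import math

def bessel_j(p, x, terms=40):
    """Partial sum of the series for J_p(x), p a nonnegative integer."""
    total = 0.0
    for n in range(terms):
        total += ((-1) ** n * (x / 2) ** (2 * n + p)
                  / (math.factorial(n) * math.factorial(n + p)))
    return total

# Residual of Bessel's equation x^2 y'' + x y' + (x^2 - p^2) y = 0 at one point,
# with the derivatives replaced by central differences (step h is an assumption).
p, x, h = 1, 1.7, 1e-4
y0 = bessel_j(p, x)
y1 = (bessel_j(p, x + h) - bessel_j(p, x - h)) / (2 * h)
y2 = (bessel_j(p, x + h) - 2 * y0 + bessel_j(p, x - h)) / h ** 2
print(x ** 2 * y2 + x * y1 + (x ** 2 - p ** 2) * y0)  # should be close to zero
```

The residual printed at the end is not exactly zero, since the derivatives are only approximated, but it should be small compared with the individual terms.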
Now, the differential equation (14-1) is meaningful even when p is not a
positive integer, and the series solution (before introduction of factorials)
is also well defined for general p. It is natural to inquire if we can define p!
in such a way that (14-9) is meaningful and satisfies (14-1) for p unre-
stricted. A glance at (14-8) shows that such an extension may not be pos-
sible when p is a negative integer, but there appears to be no difficulty
otherwise.
To obtain an appropriate definition of p! for arbitrary p, we introduce
the so-called gamma function $\Gamma(p)$ defined by
    $\Gamma(p+1) = \int_0^{\infty} t^p e^{-t}\, dt, \qquad p > 0.$    (14-10)
This function was discovered by the celebrated Swiss mathematician L.
Euler. Because of its connection with p! the function $\Gamma(p+1)$ is often
called the factorial function and is written p! or $\Pi(p)$. We shall use the
notation p! as soon as we have established that (14-10) gives the
appropriate value when p is an integer. For comparison with other notations,
    $p! \equiv \Pi(p) \equiv \Gamma(p+1).$
If p > 0, integration by parts¹ in (14-10) gives
    $\int_0^{\infty} t^p e^{-t}\, dt = -t^p e^{-t}\Big|_0^{\infty} + p \int_0^{\infty} t^{p-1} e^{-t}\, dt.$
Since the integrated term drops out, the foregoing relation simplifies to
    $\Gamma(p+1) = p\,\Gamma(p)$    (14-11)
when we use (14-10).
Writing (14-11) in the form
    $\Gamma(p) = p^{-1}\,\Gamma(p+1)$    (14-12)
1 Since the improper integrals (Sec. 3) are convergent, the process is justified by
writing b in place of $\infty$, carrying out the partial integration, and then letting
$b \to \infty$. Actually (14-10) holds for p > -1, and the partial integration is valid
for p > 0.
enables us to define $\Gamma(p)$ for negative values¹ of p. Thus, if any number p between
0 and 1 is used on the left side of (14-12) then the right side gives the value of $\Gamma(p)$,
for when p > 0, $\Gamma(p+1)$ is determined by (14-10). If the recursion formula (14-12) is
used again, the values for -1 < p < 0 can be found from those for 0 < p < 1. That
is, p + 1 in (14-12) ranges over the interval (0,1) if p ranges over the interval (-1,0).
Similarly, when we know $\Gamma(p)$ for -1 < p < 0, we can find $\Gamma(p)$ for -2 < p < -1,
and so on.
Inasmuch as (14-10) gives
    $\Gamma(1) = \int_0^{\infty} t^0 e^{-t}\, dt = -e^{-t}\Big|_0^{\infty} = 1,$    (14-13)
the method fails for p = 0. Thus
    $\lim_{p \to 0} \Gamma(p) = \lim_{p \to 0} p^{-1}\,\Gamma(p+1) = +\infty$ or $-\infty$
according as $p \to 0$ through positive values or through negative values. Similar
behavior is found for all negative integers, and hence the
graph of $\Gamma(p)$ has the appearance shown in Fig. 8.
However, by use of (14-10) and (14-11) it is easily
verified that $\Gamma(p)$ never vanishes, and hence, if we
agree that $1/\Gamma(p) = 0$ for p zero or a negative integer, it will
follow that the function $1/\Gamma(p)$ is well behaved for
every value of p without exception.²
Equations (14-11) and (14-13) give $\Gamma(2) = \Gamma(1) = 1$
and, in succession,
    $\Gamma(3) = 2\,\Gamma(2) = 2\cdot 1$
    $\Gamma(4) = 3\,\Gamma(3) = 3\cdot 2\cdot 1$
    $\Gamma(n+1) = n\,\Gamma(n) = n(n-1)\cdots 3\cdot 2\cdot 1.$
Hence, the definition
    $p! = \Gamma(p+1)$    (14-14)
furnishes the desired generalization; it gives a meaning to p! for all p except p = -1,
-2, -3, . . . , and it gives the familiar value when p is a positive integer. The properties
    $p! = p\,(p-1)!$    or    $\dfrac{p}{p!} = \dfrac{1}{(p-1)!}$    (14-15)
are ensured by (14-11). The former fails when p is zero or a negative integer, but the
latter holds for all p, without exception.
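The extension of the gamma function to negative arguments by repeated use of $\Gamma(p) = p^{-1}\Gamma(p+1)$ can be imitated in a few lines. This is a sketch under stated assumptions: Python's standard `math.gamma` stands in for the integral when p > 0, and the name `gamma_extended` is hypothetical.

```python
import math

def gamma_extended(p):
    """Gamma(p) for p > 0 directly, and for negative non-integer p by recursion."""
    if p > 0:
        return math.gamma(p)          # plays the role of the integral definition
    if p == int(p):
        raise ValueError("Gamma(p) is not defined at zero or a negative integer")
    return gamma_extended(p + 1) / p  # Gamma(p) = Gamma(p + 1) / p

# The factorial property Gamma(n + 1) = n! for small positive integers.
for n in range(1, 6):
    print(n, gamma_extended(n + 1), math.factorial(n))

# A value on (-1, 0) obtained from a value on (0, 1).
print(gamma_extended(-0.5), -2 * math.sqrt(math.pi))
```

The library function already accepts negative non-integer arguments, so the recursion is shown here only to mirror the argument of the text, not because it is needed in practice.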
The result of this discussion is that 1/p! is well defined everywhere, and
p! is well defined except for p = -1, -2, -3, . . . , at which values 1/p!
vanishes. Moreover, we have the fundamental formula (14-15). When a
series containing factorials is obtained by solving a differential equation, it
is almost always this relation (14-15) that makes the series a valid solution,
1 Equation (14-10) does not serve this purpose, because the behavior of $t^p$ at t = 0
makes the integral diverge when $p \le -1$.
2 The relation of $\Gamma(p)$ and $1/\Gamma(p)$ is quite analogous to the relation of csc p and sin p,
respectively.
and hence the extended definition of p! may be used without hesitation.
In particular, $J_p(x)$ and $J_{-p}(x)$ are solutions of (14-1) no matter what
value p may have.
For most values of p the functions $J_p(x)$ and $J_{-p}(x)$ are independent, by
Theorem I, Sec. 13, and the general solution of Bessel’s equation is
    $y = c_1 J_p(x) + c_2 J_{-p}(x), \qquad c_1$ and $c_2$ const.
If p = 0, however, then the two roots of the indicial equation are both
$\rho = 0$, so that we obtain only the single function $J_0(x)$. Another
exceptional case arises when p = ±1, ±2, . . . . Although the series (14-8) is
meaningless when p is a negative integer, the series (14-9) is well defined
and satisfies
    $J_{-m}(x) = (-1)^m J_m(x), \qquad m = 0, 1, 2, 3, \ldots.$    (14-16)
To see this, we observe that
    $J_{-m}(x) = \sum_{n=0}^{\infty} \dfrac{(-1)^n (x/2)^{2n-m}}{n!\,(n-m)!} = \sum_{n=m}^{\infty} \dfrac{(-1)^n (x/2)^{2n-m}}{n!\,(n-m)!},$    (14-17)
since the factor 1/(n - m)! is zero when n < m. If the sums (14-9) and
(14-17) are written in full, then (14-16) follows at once.¹ Because of (14-16)
the functions $J_{-m}(x)$ and $J_m(x)$ are dependent, so that we obtain only one
solution rather than two.
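The relation $J_{-m}(x) = (-1)^m J_m(x)$ can be checked numerically once 1/q! is taken to be $1/\Gamma(q+1)$, with the value zero when q is a negative integer. The following sketch assumes x > 0 (so that fractional powers stay real); the names `recip_factorial` and `bessel_j_general` are hypothetical.

```python
import math

def recip_factorial(q):
    """1/q! = 1/Gamma(q + 1), defined to be 0 when q is a negative integer."""
    if q < 0 and q == int(q):
        return 0.0
    return 1.0 / math.gamma(q + 1)

def bessel_j_general(p, x, terms=40):
    """Partial sum of the series for J_p(x) with generalized factorials, x > 0."""
    return sum((-1) ** n * (x / 2) ** (2 * n + p)
               * recip_factorial(n) * recip_factorial(n + p)
               for n in range(terms))

x = 1.3
for m in (1, 2, 3):
    # The two printed numbers should agree to rounding error.
    print(m, bessel_j_general(-m, x), (-1) ** m * bessel_j_general(m, x))
```

The terms with n < m drop out automatically because `recip_factorial(n - m)` returns zero there, which is exactly the mechanism described in the text.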
This failure of the method to provide both solutions is not a serious
shortcoming, since the second solution can always be found from the first
by the method of Chap. 1, Sec. 29. Carrying out the calculation in the
general case yields the following theorem:²
Theorem II. When the roots $\rho_1$ and $\rho_2$ of the indicial equation are distinct
and do not differ by an integer, the method of Theorem I yields two linearly
independent solutions. If the roots do differ by an integer, a second solution
can be found by assuming that
    $y_2 = c\, y_1(x) \log x + \sum_{n=0}^{\infty} a_n x^{n+\rho_2},$    (14-18)
where $y_1(x)$ is the solution given by Theorem I for the root $\rho = \rho_1$.
By setting $y_1(x) = J_0(x)$, for example, one can show that a second solution of Bessel’s
equation for p = 0 is
    $K_0(x) = J_0(x) \log x - \sum_{n=1}^{\infty} \dfrac{(-1)^n (x/2)^{2n}}{(n!)^2} \left( 1 + \dfrac{1}{2} + \cdots + \dfrac{1}{n} \right).$
1 It is suggested that the reader verify this statement by actually writing the sums
in full.
2 It was shown by Frobenius that the second solution can also be obtained by
differentiating the first solution with respect to the exponent $\rho$; cf. Piaggio’s book
cited in Sec. 13.
This function is called the Bessel function of the zeroth order and second kind. Thus,
the general solution of
    $x y'' + y' + x y = 0$
is $y = c_1 J_0(x) + c_2 K_0(x)$. Other functions $K_m(x)$ of the second kind are obtained
similarly. By considering linear combinations of $J_m(x)$ and $K_m(x)$ we get the modified
Bessel functions of the first and second kinds, denoted in the literature by $I_m(x)$,
$Y_m(x)$, $N_m(x)$, and $H_m(x)$.
PROBLEMS
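1. Show that $J_0'(x) = -J_1(x)$ and also that
    $\dfrac{d}{dx}\left[x^n J_n(x)\right] = x^n J_{n-1}(x), \qquad \dfrac{d}{dx}\left[x^{-n} J_n(x)\right] = -x^{-n} J_{n+1}(x).$
Deduce that
    $J_{n-1}(x) - J_{n+1}(x) = 2 J_n'(x),$
    $J_{n-1}(x) + J_{n+1}(x) = \dfrac{2n}{x}\, J_n(x).$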
2. The confluent hypergeometric equation is
    $x y'' + p y' = x y' + q y,$
where p and q are constant.
(a) According to Theorem I, what range of validity do you expect for a solution of the
form $\sum a_n x^{n+\rho}$?
(b) Assuming that $y = \sum a_n x^{n+\rho}$, verify that $x y'' = \sum a_{n+1}(n+\rho+1)(n+\rho)\, x^{n+\rho}$ and
find similar expressions for $p y'$, $x y'$, and $q y$.
(c) By considering the coefficient of $x^{n+\rho}$, deduce
    $a_{n+1}(n+\rho+1)(n+\rho+p) = a_n (n+\rho+q)$
for all values of n.
3. In Prob. 2, (a) show that the roots of the indicial equation are $\rho = 0$ and $\rho = 1 - p$.
Hint: The first nontrivial case arises when n = -1.
(b) When $\rho = 0$, show that
    $a_{n+1} = a_n\, \dfrac{n+q}{(n+1)(n+p)}$
if p is not zero or a negative integer. Thus get the solution
    $y_1 = a_0 \left[ 1 + \dfrac{q}{p}\, x + \dfrac{q(q+1)}{p(p+1)}\, \dfrac{x^2}{2!} + \dfrac{q(q+1)(q+2)}{p(p+1)(p+2)}\, \dfrac{x^3}{3!} + \cdots \right].$
(c) Similarly, obtain the solution corresponding to $\rho = 1 - p$ when p is not a positive
integer.
4. For the hypergeometric equation of Gauss,
    $x(1-x)\, y'' + [c - (a+b+1)x]\, y' - ab\, y = 0,$
obtain one solution in the form
    $y = \sum_{n=0}^{\infty} \dfrac{\Gamma(a+n)\,\Gamma(b+n)}{\Gamma(1+n)\,\Gamma(c+n)}\, x^n$
when a and b are not negative integers.
5. For the equation
    $x y'' + (1 - 2p)\, y' + x y = 0$
obtain a recursion formula for the coefficients of a series solution, and show that one
solution is $y = x^p J_p(x)$.
6. As in Prob. 5, show that $y = x^{1/2} J_p(\lambda x)$ satisfies
    $4x^2 y'' + (4\lambda^2 x^2 - 4p^2 + 1)\, y = 0.$
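7. (a) Given that $\Gamma(\tfrac12) = \sqrt{\pi}$, obtain the formulas
    $J_{1/2}(x) = \sqrt{\dfrac{2}{\pi x}}\, \sin x, \qquad J_{-1/2}(x) = \sqrt{\dfrac{2}{\pi x}}\, \cos x.$
(b) What is the general solution of Bessel's equation with $p = \tfrac12$? (This shows that
Theorem I may yield the general solution even if $\rho_2 - \rho_1$ is an integer.)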
8. The generating function of the sequence $J_n(x)$ is
    $e^{(x/2)(h - 1/h)} = \sum_{n=-\infty}^{\infty} J_n(x)\, h^n.$
Verify that the coefficient of $h^0$ in the expansion of the exponential is, in fact, $J_0(x)$.
Hint: By the series for $e^u$ the exponential is
    $\sum_{n=0}^{\infty} \dfrac{1}{n!} \left(\dfrac{x}{2}\right)^n \left(h - \dfrac{1}{h}\right)^n.$
Pick out the term independent of h in the binomial expansion of $[h - (1/h)]^n$ when n
is even, and note that there is no such term when n is odd.
SERIES WITH COMPLEX TERMS
15. Complex Numbers. The equation $x^2 + 1 = 0$ cannot be solved by
means of real numbers because the rule of signs does not allow the square
of a real number to be negative. But if one adjoins a symbol i to the real
numbers, which satisfies the equation
    $i^2 = -1$    (15-1)
by definition, then one can construct the so-called complex numbers a + ib.
The latter satisfy the algebraic laws obeyed by real numbers, and they in-
clude the real numbers as a special case. Moreover, complex numbers en-
able us to solve not only the equation x 2 + 1 = 0 but every polynomial
equation.
Since we want to keep the familiar laws of algebra, it is easy to see how
addition and multiplication of complex numbers ought to be defined.
Indeed,¹ if a, b, and i in a + ib are to be treated like any other numbers of
elementary algebra, then
(a + ib) + (c + id) = (a + c) + i(b + d). (15-2)
1 See also Chap. 7, which contains a discussion of complex numbers and functions from
a somewhat different point of view.
This equation is now taken as the definition of addition. In the same way
we are led to define multiplication by
(a + ib)(c + id) = ( ac — bd) + i(bc + ad), (15-3)
since elementary algebra would give the product
    $ac + ibc + iad + i^2 bd = ac + i^2 bd + i(bc + ad),$
and (15-1) asserts that $i^2 = -1$. Finally, we agree that a + ib = a + bi.
It is easy to verify that these definitions (15-2) and (15-3) do preserve the familiar rules
of algebra (including those rules that were not considered in framing the definitions). For
example, complex numbers $z_k$ satisfy
    $z_1 + z_2 = z_2 + z_1, \qquad (z_1 + z_2) + z_3 = z_1 + (z_2 + z_3),$
    $z_1 z_2 = z_2 z_1, \qquad (z_1 z_2) z_3 = z_1 (z_2 z_3),$
    $z_1 (z_2 + z_3) = z_1 z_2 + z_1 z_3.$
Also there is a zero, and there is a unit:
    $z + (0 + 0i) = z, \qquad z(1 + 0i) = z$    for all z.
Moreover, the complex number a + 0i is found to be equivalent in every respect, except
notation, to the real number a. Hence in this sense the complex numbers contain the
reals as a special case, and we have a right to consider that
    $a + 0i = a.$    (15-4)
The convention (15-4) also agrees with our purpose of keeping the rules of algebra intact.
Using (15-4) we write 0 and 1 for the zero and unit element of our algebra. Subtraction
is defined by considering the equation (a + ib) + z = 0; it will be found that
    $-(a + bi) = (-a) + (-b)i = (-1)(a + ib).$
Although we made no attempt to preserve the cancellation law, it is nevertheless valid;
that is,
    $z_1 z_2 = 0$ only if $z_1 = 0$ or $z_2 = 0$.
And finally, the possibility of division is suggested by
    $\dfrac{a + ib}{c + id} = \dfrac{(a + ib)(c - id)}{(c + id)(c - id)} = \dfrac{ac + bd}{c^2 + d^2} + i\, \dfrac{bc - ad}{c^2 + d^2}.$
Now, the latter expression can be shown by (15-3) to satisfy the equation
    $(c + id)\, z = a + ib.$
Hence, the result of this heuristic calculation is, in fact, the quotient. The process breaks
down if c + id = 0, but only then.
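The definitions of multiplication and division can be tried out on ordered pairs (a, b) standing for a + ib. The sketch below is illustrative only; the names `c_mul` and `c_div` are hypothetical, and Python's built-in complex type is used merely as an independent check.

```python
def c_mul(z, w):
    """(a + ib)(c + id) = (ac - bd) + i(bc + ad), with numbers stored as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, b * c + a * d)

def c_div(z, w):
    """(a + ib)/(c + id) = (ac + bd)/(c^2 + d^2) + i (bc - ad)/(c^2 + d^2)."""
    (a, b), (c, d) = z, w
    denom = c * c + d * d
    if denom == 0:
        raise ZeroDivisionError("the divisor c + id is zero")
    return ((a * c + b * d) / denom, (b * c - a * d) / denom)

z, w = (1.0, 2.0), (3.0, 4.0)
product = c_mul(z, w)
print(product, complex(*z) * complex(*w))
# Dividing the product by w should recover z, confirming (c + id) z = a + ib.
print(c_div(product, w))
```

Storing a complex number as a pair of reals makes it plain that nothing beyond real arithmetic is involved in either operation.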
The general tenor of this discussion is that the algebra of complex num-
bers agrees with the algebra of real numbers and we need not hesitate to
apply the familiar rules to the new symbols. There is one new feature,
however. When we say that a + ib is a symbol for a complex number, we
assume, naturally, that different symbols represent different numbers. In
other words, if
    a + ib = a' + ib' for a, a', b, b' real, then a = a' and b = b'.    (15-5)
This important relation may be taken as the definition of equality. Un-
like the algebraic properties described hitherto, (15-5) is true for complex
numbers only; it does not hold if i is replaced by a real number.
The following alternative analysis of (15-5) shows the role of the equation $i^2 = -1$
and also shows why a, a', b, b' must be assumed real. If a + ib = a' + ib', then
    $a - a' = i(b' - b).$
Squaring gives $(a - a')^2 = -(b - b')^2$ because $i^2 = -1$, and hence
    $(a - a')^2 + (b - b')^2 = 0.$
Since the square of a real number is positive unless the number is zero, the latter
equation implies a - a' = 0 and b - b' = 0.
Complex numbers z = x + iy may be represented graphically in the so-
called z plane by introducing two perpendicular axes, one for x and one for
y (Fig. 9). The x axis is called the real axis , and x is the real part of x + iy;
the iy axis is the imaginary axis, and iy is the imaginary part of x + iy.
The absolute value of z is the distance from the representative point to the
origin; it is denoted by |z|, as in the case of real numbers. Evidently, the
points satisfying |z| = r lie on a circle of radius r centered at the origin.
The interior of this circle consists of the points |z| < r. When z = x + iy,
then
    $|z| = \sqrt{x^2 + y^2}.$    (15-6)
A short calculation based on (15-6) gives
    $|z_1 z_2| = |z_1|\,|z_2|, \qquad \left|\dfrac{z_1}{z_2}\right| = \dfrac{|z_1|}{|z_2|}$    if $z_2 \neq 0$,
so that the absolute value of a product is the product of the absolute values,
and similarly for quotients.
Since real and imaginary parts are added separately, computation of
$z_1 + z_2$ can be effected as shown¹ in Fig. 9. Inasmuch as the sum of two
sides of any triangle is greater than the third, the figure gives the important
inequality
    $|z_1 + z_2| \le |z_1| + |z_2|.$    (15-7)
1 One may think of z = x + iy as a vector with components x and y. The method of
adding vectors by adding components agrees with the definition (15-2), and hence the
construction of Fig. 9 is simply the parallelogram rule familiar from mechanics.
A similar result may be obtained from this one for any number of complex
numbers.
If $s_n = u_n + i v_n$ and $s = u + iv$, we define $\lim s_n = s$ to mean that
simultaneously
    $\lim u_n = u, \qquad \lim v_n = v.$    (15-8)
This shows that the theory of limits for complex numbers can be based on
the corresponding theory for real numbers.
PROBLEMS
1. Show that $|z_1 z_2| = |z_1|\,|z_2|$.
2. Show that $\lim s_n = s$ if, and only if, $\lim |s_n - s| = 0$.
3. Sketch the set of points z in the complex plane described by (a) |z| = 1, (b) |z| < 2,
(c) |z| > 1, (d) |z - 2| < 1. Hint: |z - a| is the distance from the representative
point for z to that for a.
16. Complex Series. Convergence of infinite series of complex numbers
is defined by considering the limit of partial sums, just as for real series.
By (15-8) the complex series converges if, and only if, the two real series
obtained by considering real and imaginary parts are both convergent.
In other words,
    $\sum (p_n + i q_n) = p + iq$
if, and only if, $\sum p_n = p$ and $\sum q_n = q$. Because of this correspondence of
real and complex series, most of the results presented hitherto in this chap-
ter apply with little change to the complex case, and the proof also involves
nothing new.
As an illustration let us show that the general term of a convergent series approaches
zero. If $a_n$ is complex, we have
    $a_n = \sum_{k=1}^{n} a_k - \sum_{k=1}^{n-1} a_k,$
and taking the limit as $n \to \infty$ yields $\lim a_n = 0$, exactly as in the proof for real series.
Alternatively, one may use the result for real series. Namely, if $a_k = p_k + i q_k$, then the
convergence of $\sum a_k$ implies the convergence of $\sum p_k$ and of $\sum q_k$. Hence, by the theorem for
real series, $\lim p_n = 0$ and $\lim q_n = 0$. Consequently, $\lim a_n = 0$.
As a second illustration, we show that an absolutely convergent series is convergent.
With $a_k = p_k + i q_k$, we have
    $|p_k| = \sqrt{p_k^2} \le \sqrt{p_k^2 + q_k^2} = |a_k|.$
Hence, if $\sum |a_k|$ converges, then $\sum |p_k|$ converges by the comparison test for real series,
and then $\sum p_k$ converges, because we know that for real series absolute convergence
implies convergence. Similarly, $\sum q_k$ converges, and hence $\sum (p_k + i q_k)$ converges.
As a third example, the reader may obtain the analogue of Theorem I, Sec. 8, for
complex series. That is, if $\sum a_n z^n$ converges for $z = z_0$, then the series converges
absolutely for all z such that $|z| < |z_0|$ and uniformly for all z such that
$|z| \le |z_1| < |z_0|$. It will be found that the proof is the same, word for word, as the
proof in the case of real series. The symbol for absolute value, however, has the
meaning assigned in (15-6).
For many series 2 u n (z), the set of points z at which the series converges
gives a complicated region in the z plane. It is a remarkable fact that for
a power series the region of convergence is always a circle centered at the
origin. The circle is called the circle of convergence, and the radius of the
circle is the radius of convergence. We agree to take the radius as zero if
the series converges for z = 0 only and as infinity if the series converges
for all z. At points on the boundary of the circle the series may either
converge or diverge, just as in the case of the interval of convergence for
real series.
For proof that the region is a circle, let $\sum a_n z^n$ be a power series which
converges for some value $z = z_0 \neq 0$ but diverges for some other value
$z = z_1$. As we have already noted, the fact that the series converges for
$z = z_0$ makes the series converge throughout the circle $|z| < |z_0|$. On
the other hand, the series obviously does not converge throughout any
circle containing the point $z_1$ (see Fig. 10). We let C be the largest circle
|z| = r such that the series converges at every interior point of C. The radius
r of C is at least equal to $|z_0|$ but does not exceed $|z_1|$.
To show that C is the circle of convergence, all we have to do is establish
that the series diverges at every point exterior to C. Let $z_2$ be an exterior
point, so that $|z_2| > r$. If the series converged at $z_2$, then it would have to
converge throughout the circle $|z| < |z_2|$. But this contradicts the fact
that C is the largest circle throughout which the series converges, and hence
the proof is complete.¹
To illustrate the concept of circle of convergence consider the series
    $\dfrac{1}{1 + x^2} = 1 - x^2 + x^4 - \cdots + (-1)^n x^{2n} + \cdots,$
1 The fact that a largest circle exists is quite clear from the geometry. An analytical
proof may be given, if desired, by constructing circles with successively larger radii
and using the fundamental principle (Sec. 1).
which converges for |x| < 1 but diverges at x = ±1. If $1/(1 + x^2)$ is
regarded as a function of the real variable x, there appears to be no reason
why the series should diverge when |x| > 1, for $1/(1 + x^2)$ has derivatives
of all orders at every value of x. But when we regard x as a complex
variable, the divergence is explained by the fact that the denominator $1 + x^2$
vanishes at x = ±i. Clearly, if the circle of convergence cannot contain
the points ±i, then the radius of convergence cannot exceed 1.
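The behavior of the series $1 - x^2 + x^4 - \cdots$ inside and outside the unit circle can be observed by computing partial sums. This is a small sketch; the name `partial_sum` and the sample points are arbitrary choices made for illustration.

```python
def partial_sum(x, terms):
    """Sum of (-1)^n x^(2n) for n = 0, ..., terms - 1; x may be real or complex."""
    return sum((-1) ** n * x ** (2 * n) for n in range(terms))

# Inside the unit circle the partial sums settle down to 1/(1 + x^2),
# for real and for complex x alike.
for x in (0.5, 0.5j, 0.3 + 0.4j):
    print(x, partial_sum(x, 60), 1 / (1 + x * x))

# Outside the unit circle the partial sums grow without bound.
for terms in (11, 31, 61):
    print(terms, partial_sum(1.1, terms))
```

The point 0.5i is as close to the singularity at i as 0.5 is to the divergence point 1, which is why the complex plane, rather than the real line, gives the natural picture.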
Fig 10
PROBLEMS
1. Verify that the series $\sum_{n=1}^{\infty} z^n/n^2$ converges absolutely at every boundary point of
its circle of convergence whereas $\sum_{n=1}^{\infty} n^2 z^n$ converges at no boundary point of its circle
of convergence.
2. If $f(z) = e^{-1/z^2}$ and f(0) = 0,
(a) How does f(z) behave when z = x and $x \to 0$ through real values?
(b) How does f(z) behave when z = iy and $y \to 0$ through real values?
(c) Could this function have a power-series expansion valid in some circle
containing the origin? Explain.
17. Applications. By means of power series the functions sin x, $e^x$, log x,
$\tan^{-1} x$, $J_p(x)$, and so on, may be extended to a complex variable z. For
example, since the series
    $-\log(1 - x) = x + \dfrac{x^2}{2} + \dfrac{x^3}{3} + \cdots + \dfrac{x^n}{n} + \cdots$
converges for |x| < 1, we know without further discussion that
    $z + \dfrac{z^2}{2} + \dfrac{z^3}{3} + \cdots + \dfrac{z^n}{n} + \cdots$
converges for |z| < 1. The latter series is the definition of $-\log(1 - z)$.
Similarly $e^z$, sin z, and cos z are defined by
    $e^z = \sum_{n=0}^{\infty} \dfrac{z^n}{n!}, \qquad \sin z = \sum_{n=0}^{\infty} \dfrac{(-1)^n z^{2n+1}}{(2n+1)!}, \qquad \cos z = \sum_{n=0}^{\infty} \dfrac{(-1)^n z^{2n}}{(2n)!}.$
Many familiar formulas can be extended at once to the complex-variable
case. To establish that
    $\sin^2 z + \cos^2 z = 1,$    (17-1)
for example, we reason as follows. It is known that (17-1) holds when z is a
real variable x, and hence
    $\left[ \sum \dfrac{(-1)^n x^{2n+1}}{(2n+1)!} \right]^2 + \left[ \sum \dfrac{(-1)^n x^{2n}}{(2n)!} \right]^2 - 1 = 0$    (17-2)
when x is real. The left side of (17-2) is a power series, as we see by imagin-
ing that the terms are collected. Since the power series is zero for an inter-
val of x values, every coefficient must be zero (Sec. 8, Theorem III). Hence
the power series is also zero when x is replaced by a complex variable z, and
this establishes (17-1).
The same method may be used to prove formulas involving two variables; for example,
from
    $e^{x_1 + x_2} = e^{x_1} e^{x_2}$    for $x_1$ and $x_2$ real,
it follows without detailed calculation that
    $e^{z_1 + z_2} = e^{z_1} e^{z_2}$    for $z_1$ and $z_2$ complex.
The systematic development of this idea leads to a branch of analysis known as the
theory of analytic continuation.
Upon letting z = ix in the series for $e^z$, we get
    $e^{ix} = \sum_{n=0}^{\infty} \dfrac{(ix)^n}{n!} = \left( 1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!} - \dfrac{x^6}{6!} + \cdots \right) + i \left( x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \dfrac{x^7}{7!} + \cdots \right)$
when we write out the sum in full, noting that
    $i^2 = -1, \qquad i^3 = -i, \qquad i^4 = 1, \qquad \ldots.$
The series representations for cos x and sin x now give Euler’s formula,
    $e^{ix} = \cos x + i \sin x,$    (17-3)
which expresses the exponential function in terms of the trigonometric
functions. On the other hand (17-3) also leads to
    $\cos x = \dfrac{e^{ix} + e^{-ix}}{2}, \qquad \sin x = \dfrac{e^{ix} - e^{-ix}}{2i},$    (17-4)
as the reader can verify. These equations are constantly used in the study
of periodic phenomena, for example, in network analysis and synthesis, in
physical optics, and in electromagnetic theory. The calculations are ordi-
narily carried out by complex exponentials, and then the appropriate
trigonometric form is obtained by taking real or imaginary parts [cf. (17-3)].
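Euler's formula $e^{ix} = \cos x + i \sin x$ may be confirmed numerically by summing the exponential series with a purely imaginary argument. The sketch below is illustrative; the name `exp_series` and the number of terms are assumptions.

```python
import math

def exp_series(z, terms=40):
    """Partial sum of the series sum z^n / n! for real or complex z."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)   # turns z^n/n! into z^(n+1)/(n+1)!
    return total

x = 2.0
print(exp_series(1j * x), complex(math.cos(x), math.sin(x)))

# cos x and sin x recovered from the exponentials e^{ix} and e^{-ix}.
e_plus, e_minus = exp_series(1j * x), exp_series(-1j * x)
print((e_plus + e_minus) / 2, math.cos(x))
print((e_plus - e_minus) / 2j, math.sin(x))
```

Each term is obtained from the preceding one by a single multiplication, which avoids computing large powers and factorials separately.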
Example 1. Obtain the trigonometric identity
    $\cos u + \cos 2u + \cdots + \cos nu = \dfrac{\sin (n + \tfrac12) u}{2 \sin \tfrac12 u} - \dfrac12$    (17-5)
for $u \neq 0, \pm 2\pi, \pm 4\pi, \ldots$, by considering the exponential sum
    $s = e^{iu} + e^{2iu} + \cdots + e^{inu}.$
The series s is a geometric series with ratio $r = e^{iu}$, and hence
    $s = e^{iu}\, \dfrac{e^{inu} - 1}{e^{iu} - 1}$    (17-6)
by (1-18). If the numerator and denominator in (17-6) are multiplied by $e^{-iu/2}$, we get
    $s = \dfrac{e^{i(n + 1/2)u} - e^{iu/2}}{e^{iu/2} - e^{-iu/2}} = \dfrac{e^{i(n + 1/2)u} - e^{iu/2}}{2i \sin \tfrac12 u}$
upon using (17-4). By (17-3) this yields another expression for the exponential sum s,
    $s = \dfrac{\cos (n + \tfrac12) u + i \sin (n + \tfrac12) u - \cos \tfrac12 u - i \sin \tfrac12 u}{2i \sin \tfrac12 u},$
which leads to (17-5) when we equate real parts.
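The result of Example 1, written in the form $\cos u + \cos 2u + \cdots + \cos nu = \sin(n + \tfrac12)u \,/\, (2 \sin \tfrac12 u) - \tfrac12$, can be compared with a direct summation and with the real part of the exponential sum. The function names below are hypothetical and the sample values of n and u are arbitrary.

```python
import cmath
import math

def cosine_sum(n, u):
    """Direct evaluation of cos u + cos 2u + ... + cos nu."""
    return sum(math.cos(k * u) for k in range(1, n + 1))

def exponential_sum(n, u):
    """The geometric sum s = e^{iu} + e^{2iu} + ... + e^{inu}."""
    return sum(cmath.exp(1j * k * u) for k in range(1, n + 1))

def closed_form(n, u):
    """sin((n + 1/2) u) / (2 sin(u / 2)) - 1/2, valid when sin(u / 2) != 0."""
    return math.sin((n + 0.5) * u) / (2 * math.sin(u / 2)) - 0.5

for n, u in ((1, 0.3), (5, 2.0), (12, 0.7)):
    print(n, u, cosine_sum(n, u), exponential_sum(n, u).real, closed_form(n, u))
```

The three numbers printed on each line should agree to rounding error, provided u is not a multiple of $2\pi$.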
Example 2. Show that $\int_0^{2\pi} e^{ikx}\, dx = 0$ if k is a nonzero integer, and deduce
    $\int_0^{2\pi} \cos nx \cos mx\, dx = 0$ if $m \neq n$, $\;= \pi$ if $m = n$,
    $\int_0^{2\pi} \cos nx \sin mx\, dx = 0,$
    $\int_0^{2\pi} \sin nx \sin mx\, dx = 0$ if $m \neq n$, $\;= \pi$ if $m = n$,    (17-7)
whenever m and n are positive integers.
If k is a nonzero integer, (17-3) gives
    $\int_0^{2\pi} e^{ikx}\, dx = \int_0^{2\pi} (\cos kx + i \sin kx)\, dx = 0,$    (17-8)
which is the first result. By (17-4),
    $4 \int_0^{2\pi} \cos nx \cos mx\, dx = \int_0^{2\pi} (e^{inx} + e^{-inx})(e^{imx} + e^{-imx})\, dx$
    $\qquad = \int_0^{2\pi} \left[ e^{i(m+n)x} + e^{i(m-n)x} + e^{i(n-m)x} + e^{-i(m+n)x} \right] dx.$
Each term is of the form $e^{ikx}$ with k an integer, and hence we get zero unless m = n.
If m = n, the two middle terms give 2, so that the integral is $4\pi$. The other relations
(17-7) are established similarly.
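The orthogonality relations of Example 2 can be checked by numerical integration over the interval from 0 to $2\pi$. The sketch below uses an equally spaced sum, which is an assumption of convenience (it happens to be very accurate for periodic integrands); the name `integral_0_2pi` is hypothetical.

```python
import cmath
import math

def integral_0_2pi(f, samples=2048):
    """Equally spaced approximation to the integral of f from 0 to 2*pi."""
    h = 2 * math.pi / samples
    return h * sum(f(j * h) for j in range(samples))

# The integral of e^{ikx} vanishes for a nonzero integer k.
print(abs(integral_0_2pi(lambda x: cmath.exp(3j * x))))

# cos nx cos mx integrates to 0 when m != n and to pi when m = n.
print(integral_0_2pi(lambda x: math.cos(2 * x) * math.cos(5 * x)))
print(integral_0_2pi(lambda x: math.cos(2 * x) * math.cos(2 * x)), math.pi)

# cos nx sin mx integrates to 0 in every case.
print(integral_0_2pi(lambda x: math.cos(4 * x) * math.sin(4 * x)))
```

The same helper may be applied to the sine products to confirm the remaining relation.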
PROBLEMS
1. By using the series definition for $e^z$, show that $(d/dx)\, e^{cx} = c\, e^{cx}$ when c is a complex
constant. (Since we have not defined the derivative with respect to a complex variable,
assume that x is real.)
2. Sum the series sin x + sin 2x + · · · + sin nx.
3. Evaluate $\int e^{ax} \cos bx\, dx$ and $\int e^{ax} \sin bx\, dx$ by considering $\int e^{(a+ib)x}\, dx$ and equating
real and imaginary parts. Hint:
    $\dfrac{e^{(a+ib)x}}{a + ib} = \dfrac{e^{ax} e^{ibx} (a - ib)}{a^2 + b^2}.$
4. Evaluate $\int_0^{2\pi} (2 \cos x)^4\, dx$ by using the formula
    $2 \cos x = e^{ix} + e^{-ix}$
together with (17-8).
5. Show that every complex number z may be written in the form $z = r e^{i\theta}$, where
$r \ge 0$ and $0 \le \theta < 2\pi$. Hint: If z = x + iy, introduce polar coordinates $(r,\theta)$, so that
$x = r \cos \theta$ and $y = r \sin \theta$.
6. (a) For $0 \le r < 1$ obtain the expansion
    $\dfrac{1}{1 - r e^{i\theta}} = \sum r^n (\cos n\theta + i \sin n\theta)$
by letting $z = r e^{i\theta}$ in the series for 1/(1 - z).
(b) Separate $1/(1 - r e^{i\theta})$ into real and imaginary parts, by noting that
    $\dfrac{1 - r e^{-i\theta}}{1 - r e^{-i\theta}} \cdot \dfrac{1}{1 - r e^{i\theta}} = \dfrac{1 - r \cos \theta + i r \sin \theta}{1 - 2r \cos \theta + r^2}.$
(c) From (a) and (b) deduce that
    $\dfrac{1 - r \cos \theta}{1 - 2r \cos \theta + r^2} = \sum_{n=0}^{\infty} r^n \cos n\theta, \qquad 0 \le r < 1,$
    $\dfrac{r \sin \theta}{1 - 2r \cos \theta + r^2} = \sum_{n=0}^{\infty} r^n \sin n\theta, \qquad 0 \le r < 1.$
[The first series of (c) is an example of a Fourier cosine series, and the second is a Fourier
sine series. The study of such series by real-variable methods forms the topic of the next
eight sections.]
FOURIER SERIES
18. The Euler-Fourier Formulas. Trigonometric series of the form
    $f(x) = \tfrac12 a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$    (18-1)
are required in the treatment of many physical problems, for example, in
the theory of sound, heat conduction, electromagnetic waves, electric cir-
cuits, and mechanical vibrations. An important advantage of the series
(18-1) is that it can represent discontinuous functions, whereas a Taylor
series represents only functions that have derivatives of all orders.
We take the point of view that f(x) in (18-1) is known on $(-\pi, \pi)$ and
that the coefficients $a_n$ and $b_n$ are to be found. In order to determine $a_0$,
we integrate (18-1) term by term from $-\pi$ to $\pi$. Since
    $\int_{-\pi}^{\pi} \cos nx\, dx = \int_{-\pi}^{\pi} \sin nx\, dx = 0$    for n = 1, 2, . . . ,
the calculation yields
    $\int_{-\pi}^{\pi} f(x)\, dx = a_0 \pi.$    (18-2)
The coefficient $a_n$ is determined similarly. Thus, if we multiply (18-1) by
cos nx, there results
    $f(x) \cos nx = \tfrac12 a_0 \cos nx + \cdots + a_n \cos^2 nx + \cdots,$    (18-3)
where the terms not written involve products of the form sin mx cos nx or
of the form cos mx cos nx with $m \neq n$. It is easily verified¹ that for
integral values of m and n,
    $\int_{-\pi}^{\pi} \sin mx \cos nx\, dx = 0,$    in general,
    $\int_{-\pi}^{\pi} \cos mx \cos nx\, dx = 0,$    when $m \neq \pm n$,
and hence integration of (18-3) yields
    $\int_{-\pi}^{\pi} f(x) \cos nx\, dx = a_n \int_{-\pi}^{\pi} \cos^2 nx\, dx = a_n \pi.$
Therefore,
    $a_n = \dfrac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\, dx.$    (18-4)
By (18-2), this result is also valid² for n = 0.
1 See Example 2 of Sec. 17.
2 That is the reason for writing the constant term as $\tfrac12 a_0$ rather than $a_0$.
Similarly, multiplying (18-1) by sin nx and integrating yield
    $b_n = \dfrac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\, dx.$    (18-4a)
The formulas (18-4) are called the Euler-Fourier formulas, and the series
(18-1) which results when $a_n$ and $b_n$ are determined by the Euler-Fourier
formulas is called the Fourier series of f(x). More specifically, a Fourier
series is a trigonometric series in which the coefficients are given, for some
absolutely integrable function¹ f(x), by (18-4).
The distinction between a convergent trigonometric series and a Fourier series is im-
portant in the modern development of the subject and is a genuine distinction. For
instance, it is known that the trigonometric series
    $\sum_{n=1}^{\infty} \dfrac{\sin nx}{\log (1 + n)}$
is convergent for every value of x without exception, and yet this series is not a Fourier
series. In other words, there is no absolutely integrable function f(x) such that
    $\int_{-\pi}^{\pi} f(x) \cos nx\, dx = 0, \qquad \int_{-\pi}^{\pi} f(x) \sin nx\, dx = \dfrac{\pi}{\log (1 + n)}.$
On the other hand a series may be a Fourier series for some function f(x) and yet diverge.
Although such functions are not considered in this book, they often arise in practice,
for example, in the theory of Brownian motion, in problems of filtering and noise, or in
analyzing the ground return to a radar system. Even when divergent, the Fourier series
represents the main features of f(x), and for this reason Fourier series are an indispensable
aid in problems of the sort just mentioned.
Treatises devoted to Fourier series commonly replace the sign of equality in (18-1)
by ~, or some similar symbol, to indicate that the series on the right is the Fourier
series of the function on the left. We shall continue to use the equality sign because the
series obtained in this book do, in fact, converge to the function from which the
coefficients were derived.
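To illustrate the calculation of a Fourier series, let f(x) = x. By Eqs.
(18-4),
    $a_n = \dfrac{1}{\pi} \int_{-\pi}^{\pi} x \cos nx\, dx = 0,$
    $b_n = \dfrac{1}{\pi} \int_{-\pi}^{\pi} x \sin nx\, dx = -\dfrac{2}{n} \cos n\pi = \dfrac{2}{n} (-1)^{n+1},$
so that, upon substituting in (18-1),
    $x = 2 \left( \dfrac{\sin x}{1} - \dfrac{\sin 2x}{2} + \dfrac{\sin 3x}{3} - \cdots \right).$    (18-5)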
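The Euler-Fourier coefficients of f(x) = x can also be approximated by numerical integration and compared with the values $a_n = 0$, $b_n = 2(-1)^{n+1}/n$. The sketch below assumes a midpoint rule on $(-\pi, \pi)$; the name `fourier_coefficients` and the number of sample points are illustrative choices.

```python
import math

def fourier_coefficients(f, n, samples=20000):
    """Midpoint-rule approximation to a_n and b_n for f given on (-pi, pi)."""
    h = 2 * math.pi / samples
    xs = [-math.pi + (j + 0.5) * h for j in range(samples)]
    a_n = sum(f(x) * math.cos(n * x) for x in xs) * h / math.pi
    b_n = sum(f(x) * math.sin(n * x) for x in xs) * h / math.pi
    return a_n, b_n

for n in range(1, 5):
    a_n, b_n = fourier_coefficients(lambda x: x, n)
    # The last column is the exact value of b_n found by integration by parts.
    print(n, a_n, b_n, 2 * (-1) ** (n + 1) / n)
```

The computed $a_n$ are zero only to rounding error, and the computed $b_n$ differ from the exact values by the small error of the integration rule.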
In Sec. 24 it is shown that the series (18-5) does converge to x for $-\pi <
x < \pi$. To discuss the convergence outside this interval, we introduce the
notion of periodicity. A function f(x) is said to be periodic if f(x + p)
= f(x) for all values of x, where p is a nonzero constant. Any number p
with this property is a period of f(x); for instance, sin x has the periods $2\pi$,
$-2\pi$, $4\pi$, . . . .
1 This means that |f(x)|, as well as f(x), is integrable.
Now, each term of the series (18-5) has period $2\pi$, and hence the sum
also has period $2\pi$. The graph of the sum therefore has the appearance
shown in Fig. 11. Evidently, the sum is equal to x only on the interval
$-\pi < x < \pi$, and not on the whole interval $-\infty < x < \infty$.
It remains to describe what happens at the points $x = \pm\pi, \pm 3\pi, \ldots$,
where the sum of the series exhibits an abrupt jump from $-\pi$ to $+\pi$.
Upon setting $x = \pm\pi, \pm 3\pi, \ldots$ in (18-5), we see that every term is zero.
Hence the sum is zero, and this fact is indicated in the figure by placing a
dot at the points in question.
The term $a_n \cos nx + b_n \sin nx$ in (18-1) is sometimes called the nth harmonic (from
analogy with the theory of musical instruments). The first four harmonics of the series
(18-5) are
    $2 \sin x, \qquad -\sin 2x, \qquad \tfrac23 \sin 3x, \qquad -\tfrac12 \sin 4x.$
These and the next two harmonics are plotted as the numbered curves in Fig. 12. The
sum of the first four harmonics is
    $y = 2 \sin x - \sin 2x + \tfrac23 \sin 3x - \tfrac12 \sin 4x.$
Since this is a partial sum of the Fourier series, it may be expected to approximate the
function x. The closeness of the approximation is indicated by the upper curves in Fig.
12, which show this partial sum of four terms together with the sums of six and ten
terms. As the number of terms increases, the approximating curves approach y = x for
each fixed x on $-\pi < x < \pi$ but not for $x = \pm\pi$.
The foregoing example illustrates certain features which are characteristic
of Fourier series in general and which will now be discussed from a
general standpoint. Each term of the series (18-1) has period $2\pi$, and hence
if $f(x)$ is to be represented by the sum, then $f(x)$ must also have period $2\pi$.
Whenever we consider a series such as (18-1), we shall suppose that $f(x)$ is
given for $-\pi < x < \pi$ and that outside this interval $f(x)$ is determined by
the periodicity condition
$$f(x + 2\pi) = f(x).$$
Of course, any interval $a < x < a + 2\pi$ would do equally well.
The term simple discontinuity¹ is used to describe the situation that
arises when the function $f(x)$ suffers a finite jump at a point $x = x_0$ (see
Fig. 13). Analytically, this means that the two limiting values of $f(x)$, as
$x$ approaches $x_0$ from the right-hand and the left-hand sides, exist but are
unequal; that is,
$$\lim_{\epsilon\to 0} f(x_0 + \epsilon) \ne \lim_{\epsilon\to 0} f(x_0 - \epsilon), \qquad \epsilon > 0.$$
In order to economize on space, these right-hand and left-hand limits are
written as $f(x_0+)$ and $f(x_0-)$, respectively, so that the foregoing inequality
can be written as
$$f(x_0+) \ne f(x_0-).$$
A function $f(x)$ is said to be bounded if the inequality
$$|f(x)| < M$$
holds for some constant $M$ and for all $x$ under consideration. For example,
$\sin x$ is bounded, but the function
$$f(x) = \frac{1}{x} \quad\text{for } x \ne 0, \qquad f(0) = 0,$$
is not, even though the latter is well defined for every value of $x$. It can
be shown that if a bounded function has only a finite number of maxima
and minima and only a finite number of discontinuities, then all its
discontinuities are simple. That is, $f(x+)$ and $f(x-)$ exist at every value of $x$.
¹ For an example of a discontinuity which is not simple, consider $\sin(1/x)$ near $x = 0$.
The functions illustrated in Figs. 11 and 13 satisfy these conditions in every finite
interval. On the other hand, the function $\sin(1/x)$ has infinitely many maxima near
$x = 0$, and as we have noted, the discontinuity at $x = 0$ is not simple. The function
defined by
$$f(x) = x^2\sin\frac{1}{x}, \quad x \ne 0, \qquad f(0) = 0$$
also has infinitely many maxima near $x = 0$, although it is continuous and differentiable
for every value of $x$. The behavior of these two functions is illustrated graphically in
Figs. 14 and 15.
[Fig. 15]
With these preliminaries, we can state the following theorem, which
establishes the convergence of Fourier series for a very large class of
functions:
Dirichlet's Theorem. For $-\pi \le x < \pi$ suppose $f(x)$ is well defined, is
bounded, has only a finite number of maxima and minima, and has only a
finite number of discontinuities. Let $f(x)$ be defined for other values of $x$ by
the periodicity condition $f(x + 2\pi) = f(x)$. Then the Fourier series for $f(x)$
converges to
$$\tfrac{1}{2}\,[f(x+) + f(x-)]$$
at every value of $x$ [and hence it converges to $f(x)$ at points where $f(x)$ is
continuous].
The conditions imposed on $f(x)$ are called Dirichlet conditions, after the
mathematician Dirichlet who discovered the theorem. In Sec. 24 we establish
the conclusion under slightly more restrictive conditions which are
sufficient, however, for almost all applications.
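Example 1. Find the Fourier series of the periodic function defined by
$$f(x) = 0 \quad\text{if } -\pi < x < 0, \qquad f(x) = \pi \quad\text{if } 0 < x < \pi.$$
By (18-4) we have
$$a_0 = \frac{1}{\pi}\left(\int_{-\pi}^{0} 0\,dx + \int_{0}^{\pi} \pi\,dx\right) = \pi,$$
$$a_n = \frac{1}{\pi}\int_{0}^{\pi} \pi\cos nx\,dx = 0, \qquad n \ge 1,$$
$$b_n = \frac{1}{\pi}\int_{0}^{\pi} \pi\sin nx\,dx = \frac{1}{n}(1 - \cos n\pi).$$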
The factor $(1 - \cos n\pi)$ assumes the following values as $n$ increases:

    n              =  1   2   3   4   5   ...
    1 - cos n*pi   =  2   0   2   0   2   ...

Determining $b_n$ by use of this table, we obtain the required Fourier series
$$f(x) = \frac{\pi}{2} + 2\left(\frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right).$$
The graph of $f(x)$ consists of the $x$ axis from $-\pi$ to 0 and of the line $AB$ from 0 to $\pi$
(Fig. 16). There is a simple discontinuity at $x = 0$, at which point the series reduces
to $\pi/2$. Since
$$\frac{\pi}{2} = \frac{f(0-) + f(0+)}{2},$$
this value agrees with Dirichlet's theorem. Similar behavior is found at $x = \pm\pi$,
$\pm 2\pi$, ....
The figure shows the first four partial sums, whose equations are
$$y = \frac{\pi}{2}, \qquad y = \frac{\pi}{2} + 2\sin x, \qquad y = \frac{\pi}{2} + 2\left(\sin x + \frac{\sin 3x}{3}\right),$$
$$y = \frac{\pi}{2} + 2\left(\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5}\right).$$
For most functions it is only the infinite series that reduces to $\tfrac{1}{2}[f(x-) + f(x+)]$ at
points of discontinuity. In the present example, however, this condition is satisfied by
the partial sums, as the reader can verify. That is, the graph of each partial sum contains
the points $(0,\pi/2)$, $(\pm\pi,\pi/2)$, ....
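The behavior described in Example 1 can be checked numerically. The Python sketch below is an illustration only (the function name `step_partial_sum` is an assumption, not notation from the text); it evaluates partial sums of the series $\pi/2 + 2(\sin x + \tfrac{1}{3}\sin 3x + \cdots)$.

```python
import math

def step_partial_sum(x, terms):
    """pi/2 + 2*(sin x + sin 3x/3 + ...), keeping `terms` sine terms."""
    return math.pi / 2 + 2.0 * sum(math.sin((2 * k - 1) * x) / (2 * k - 1)
                                   for k in range(1, terms + 1))

if __name__ == "__main__":
    # At the jump x = 0 every partial sum equals pi/2, the mean of 0 and pi.
    print(step_partial_sum(0.0, 3), math.pi / 2)
    # Away from the jump the sums approach pi (for x > 0) or 0 (for x < 0).
    print(step_partial_sum(1.0, 3000), step_partial_sum(-1.0, 3000))
```

The value at $x = 0$ is exact for every number of terms because each sine term vanishes there, in agreement with the remark that each partial sum passes through $(0,\pi/2)$.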
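Example 2. Find the Fourier series for the periodic function $f(x)$ defined by
$$f(x) = -\pi \quad\text{if } -\pi < x < 0, \qquad f(x) = x \quad\text{if } 0 < x < \pi.$$
The integral (18-4) may be expressed as an integration from $-\pi$ to 0, followed by
integration from 0 to $\pi$. If the appropriate formula for $f(x)$ is used in these two
intervals, we get
$$a_n = \frac{1}{\pi}\left(\int_{-\pi}^{0} -\pi\cos nx\,dx + \int_{0}^{\pi} x\cos nx\,dx\right)
= \frac{1}{\pi}\left(0 + \frac{\cos n\pi}{n^2} - \frac{1}{n^2}\right) = \frac{1}{\pi n^2}(\cos n\pi - 1).$$
The integration assumes that $n \ne 0$; if $n = 0$, we get $a_0 = -\pi/2$, as the reader can
verify. Similarly,
$$b_n = \frac{1}{\pi}\left(\int_{-\pi}^{0} -\pi\sin nx\,dx + \int_{0}^{\pi} x\sin nx\,dx\right)
= \frac{1}{\pi}\left(\frac{\pi}{n} - \frac{\pi}{n}\cos n\pi - \frac{\pi}{n}\cos n\pi\right) = \frac{1}{n}(1 - 2\cos n\pi).$$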
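Therefore
$$f(x) = -\frac{\pi}{4} - \frac{2}{\pi}\left(\cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots\right)
+ 3\sin x - \frac{\sin 2x}{2} + \frac{3\sin 3x}{3} - \frac{\sin 4x}{4} + \frac{3\sin 5x}{5} - \cdots.$$
When $x = 0$, the series reduces to
$$-\frac{\pi}{4} - \frac{2}{\pi}\left(\frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \cdots\right),$$
which must coincide with (see Fig. 17)
$$\frac{f(0+) + f(0-)}{2} = -\frac{\pi}{2}.$$
Thus
$$-\frac{\pi}{2} = -\frac{\pi}{4} - \frac{2}{\pi}\left(\frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \cdots\right).$$
Hence
$$\frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \cdots = \frac{\pi^2}{8}.$$
This example suggests the use of Fourier series in evaluating sums of series of constant
terms.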
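The sum obtained in Example 2 is easy to test numerically. The short Python sketch below is illustrative only; the function name is hypothetical.

```python
import math

def odd_reciprocal_squares(terms):
    """Sum of 1/(2k-1)^2 for k = 1, ..., terms."""
    return sum(1.0 / (2 * k - 1) ** 2 for k in range(1, terms + 1))

if __name__ == "__main__":
    # The partial sums should approach pi^2 / 8 from below.
    print(odd_reciprocal_squares(100000), math.pi ** 2 / 8)
```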
PROBLEMS
1. Evaluate $\int_{-\pi}^{\pi} \cos mx\cos nx\,dx$ for integral $m$ and $n$ by use of the identity
$2\cos A\cos B = \cos(A + B) + \cos(A - B)$.
2. Find the Fourier-series expansion for f(x) } if
m - =.
for — r < x <
2
m - o.
, r
for - < x < w.
m - -x.
for — ir < x < 0,
f(x) - 0,
for 0 < x <vr t
6. In Probs. 2 to 5, sketch the graph of the function to which the Fourier series
converges in the range $-4\pi < x < 4\pi$.
19. Even and Odd Functions. For many functions the Fourier sine or
cosine coefficients can be determined by inspection, and this possibility is
now to be investigated. A function $f(x)$ is said to be even if
$$f(-x) \equiv f(x), \tag{19-1}$$
and the function $f(x)$ is odd if
$$f(-x) \equiv -f(x). \tag{19-2}$$
For example, $x^2$ and $\cos x$ are even, whereas $x$ and $\sin x$ are odd. The
graph of an even function is symmetric about the $y$ axis, as shown in Fig.
18, and the graph of an odd function is skew symmetric (Fig. 19). By
inspection of the figures it is evident that
$$\int_{-a}^{a} f(x)\,dx = 2\int_{0}^{a} f(x)\,dx \quad\text{if } f(x) \text{ is even}, \tag{19-3}$$
$$\int_{-a}^{a} f(x)\,dx = 0 \quad\text{if } f(x) \text{ is odd}, \tag{19-4}$$
since the integrals represent the signed areas under the curves.¹ For
example,
$$\int_{-a}^{a} \sin nx\,dx = 0,$$
since $\sin nx$ is an odd function.
[Fig. 18]
[Fig. 19]
Products of even and odd functions obey the rules
(even)(even) = even, (even)(odd) = odd, (odd)(odd) = even,
which correspond to the familiar rules
$$(+1)(+1) = +1, \qquad (+1)(-1) = -1, \qquad (-1)(-1) = +1.$$
For proof, let $F(x) = f(x)g(x)$, where $f(x)$ and $g(x)$ are even. Then
$$F(-x) = f(-x)g(-x) = f(x)g(x) = F(x),$$
which shows that the product $f(x)g(x)$ is even. The other two relations
are verified similarly. As an example, the product $\cos mx\sin nx$ is odd
because $\cos mx$ is even and $\sin nx$ is odd. Hence, (19-4) gives
$$\int_{-a}^{a} \cos mx\sin nx\,dx = 0$$
without detailed calculation.
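The symmetry rules (19-3) and (19-4) can be illustrated by numerical integration. The Python sketch below uses a simple midpoint rule; the helper name `midpoint_integral` and the sample integrands are assumptions made for illustration.

```python
import math

def midpoint_integral(func, lower, upper, steps=20000):
    """Approximate the integral of func over (lower, upper) by the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

def odd_product(x):
    # cos 3x is even and sin 5x is odd, so the product is odd.
    return math.cos(3 * x) * math.sin(5 * x)

def even_function(x):
    return x * x

if __name__ == "__main__":
    a = 2.0
    # (19-4): the integral of an odd function over (-a, a) vanishes.
    print(midpoint_integral(odd_product, -a, a))
    # (19-3): for an even function the integral is twice that over (0, a).
    print(midpoint_integral(even_function, -a, a),
          2 * midpoint_integral(even_function, 0.0, a))
```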
The application of these results is facilitated by the following theorem:
Theorem. If $f(x)$ defined in the interval $-\pi < x < \pi$ is even, the Fourier
series has cosine terms only and the coefficients are given by
$$a_n = \frac{2}{\pi}\int_{0}^{\pi} f(x)\cos nx\,dx, \qquad b_n = 0. \tag{19-5}$$
¹ An analytic proof of (19-3) and (19-4) may be based on (19-1) and (19-2).
The function to which this series converges is illustrated in Fig. 22, and the sum of the
series (19-7) is presented graphically in Fig. 11.
Since $|x| = x$ for $x > 0$, the two series (19-7) and (19-8) converge to the
same function $x$ when $0 < x < \pi$. The first expression (19-7) is called the
Fourier sine series for $x$, and (19-8) is the Fourier cosine series. Any function
$f(x)$ defined in $(0,\pi)$ which satisfies the Dirichlet conditions can be
expanded in a sine series and in a cosine series on $0 < x < \pi$. To obtain a
sine series, we extend $f(x)$ over the interval $-\pi < x < 0$ in such a way that
the extended function is odd. That is, we define
$$F(x) = f(x) \quad\text{on } 0 < x < \pi,$$
$$F(x) = -f(|x|) \quad\text{on } -\pi < x < 0.$$
The Fourier series for $F(x)$ consists of sine terms only, since $F(x)$ is odd.
And the coefficients are given by (19-6) because $F(x) = f(x)$ on the interval
$0 < x < \pi$. Similarly, if it is desired to obtain a cosine series for $f(x)$ on
$0 < x < \pi$, the coefficients are given by (19-5).
Example. Obtain a cosine series and also a sine series for $\sin x$.
For the cosine series (19-5) gives $b_n = 0$ and, after a short calculation,
$$a_n = \frac{2}{\pi}\int_{0}^{\pi} \sin x\cos nx\,dx = \frac{2(1 + \cos n\pi)}{\pi(1 - n^2)}, \qquad n \ne 1.$$
For $n = 1$ the result of the integration is zero, and hence
$$\sin x = \frac{2}{\pi} - \frac{4}{\pi}\left(\frac{\cos 2x}{2^2 - 1} + \frac{\cos 4x}{4^2 - 1} + \frac{\cos 6x}{6^2 - 1} + \cdots\right)$$
when $0 < x < \pi$. Since the sum of the series is an even function, it converges to $|\sin x|$
rather than $\sin x$ when $-\pi < x < 0$. This shows, by periodicity, that the series
converges to $|\sin x|$ for all values of $x$.
To obtain a sine series, (19-6) gives $a_n = 0$ and
$$b_n = \frac{2}{\pi}\int_{0}^{\pi} \sin x\sin nx\,dx =
\begin{cases} 0 & \text{for } n \ge 2, \\ 1 & \text{for } n = 1. \end{cases}$$
Hence the Fourier sine series for $\sin x$ is $\sin x$, just as one would expect. That this is
not a coincidence is shown by a Uniqueness Theorem: If two trigonometric series of
the form (18-1) converge to the same sum for all values of $x$, then corresponding coefficients
are equal.¹
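The cosine series found for $\sin x$ in the example above may be checked numerically. The Python sketch below is illustrative; `sine_as_cosine_series` is a hypothetical name.

```python
import math

def sine_as_cosine_series(x, terms):
    """2/pi - (4/pi) * sum of cos(2kx)/((2k)^2 - 1) for k = 1, ..., terms."""
    total = sum(math.cos(2 * k * x) / (4 * k * k - 1) for k in range(1, terms + 1))
    return 2.0 / math.pi - 4.0 / math.pi * total

if __name__ == "__main__":
    # On 0 < x < pi the series reproduces sin x.
    print(sine_as_cosine_series(1.0, 2000), math.sin(1.0))
    # On -pi < x < 0 it reproduces |sin x|, since the sum is an even function.
    print(sine_as_cosine_series(-1.0, 2000), abs(math.sin(-1.0)))
```

Because the coefficients decrease like $1/n^2$, this series converges much faster than the sine series for $x$.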
PROBLEMS
1. Classify the following functions as even, odd, or neither:
1 4- x
x 2 , x sin x, x * cos nx t x A % log » e*, g(x % ).
1 — x
2. Prove that any function can be represented as the sum of an even function and an
odd function. Hint: $f(x) = \tfrac{1}{2}[f(x) + f(-x)] + \tfrac{1}{2}[f(x) - f(-x)]$.
3. For $0 < x < \pi$ show that
$$\frac{\pi}{4} = \sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots.$$
Hint: Take $f(x) = \pi/4$ in (19-6).
4. A function is defined by $f(x) = x$ for $0 < x < \pi/2$ and $f(x) = 0$ elsewhere in $(-\pi,\pi)$.
Find the Fourier series, the Fourier sine series, and the Fourier cosine series. In each
case sketch the graph of the sum of the series for $-4\pi < x < 4\pi$.
5. By taking $f(x) = x^2$ in (19-5) show that
$$x^2 = \frac{\pi^2}{3} + 4\sum_{n=1}^{\infty} (-1)^n\frac{\cos nx}{n^2}$$
for $-\pi < x < \pi$, and deduce that
$$\frac{\pi^2}{12} = \frac{1}{1^2} - \frac{1}{2^2} + \frac{1}{3^2} - \frac{1}{4^2} + \cdots.$$
6. Show that if $f(x) = x$ for $0 < x < \pi/2$ and $f(x) = \pi - x$ for $\pi/2 < x < \pi$, then
$$f(x) = \frac{\pi}{4} - \frac{2}{\pi}\left(\frac{\cos 2x}{1^2} + \frac{\cos 6x}{3^2} + \frac{\cos 10x}{5^2} + \cdots\right).$$
7. Show that for $-\pi < x < \pi$,
$$\cos\alpha x = \frac{\sin\pi\alpha}{\pi\alpha} + \sum_{n=1}^{\infty} (-1)^n\frac{2\alpha\sin\pi\alpha}{\pi(\alpha^2 - n^2)}\cos nx,$$
when $\alpha$ is not an integer. Deduce
$$\cot\pi\alpha = \frac{1}{\pi}\left(\frac{1}{\alpha} - \sum_{n=1}^{\infty}\frac{2\alpha}{n^2 - \alpha^2}\right).$$
20. Extension of the Interval. The methods developed up to this point
restrict the interval of expansion to $(-\pi,\pi)$. In many problems it is desired
to develop $f(x)$ in a Fourier series that will be valid over a wider interval.
By letting the length of the interval increase indefinitely one may
expect to get an expansion valid for all $x$.
¹ This theorem is due chiefly to Riemann. It is much deeper than the analogous statement
for power series, and the proof would be quite out of place in the present book.
See E. C. Titchmarsh, “The Theory of Functions,” pp. 427-432, Oxford University
Press, London, 1950.
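To obtain an expansion valid on the interval $(-l,l)$, change the variable
from $x$ to $lz/\pi$. If $f(x)$ satisfies the Dirichlet conditions on $(-l,l)$, then the
function $f(lz/\pi)$ can be developed in a Fourier series in $z$,
$$\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos nz + \sum_{n=1}^{\infty} b_n\sin nz \tag{20-1}$$
for $-\pi < z < \pi$. Since $z = \pi x/l$, the series (20-1) becomes
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos\frac{n\pi x}{l} + \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{l}. \tag{20-2}$$
By applying (18-4) to the series (20-1), we see that
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f\!\left(\frac{lz}{\pi}\right)\cos nz\,dz = \frac{1}{l}\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx$$
and
$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f\!\left(\frac{lz}{\pi}\right)\sin nz\,dz = \frac{1}{l}\int_{-l}^{l} f(x)\sin\frac{n\pi x}{l}\,dx.$$
As an illustration we develop $f(x)$ in Fourier series in the interval $(-2,2)$ if $f(x) = 0$
for $-2 < x < 0$ and $f(x) = 1$ for $0 < x < 2$. Here
$$a_0 = \frac{1}{2}\left(\int_{-2}^{0} 0\,dx + \int_{0}^{2} 1\,dx\right) = 1,$$
$$a_n = \frac{1}{2}\left(\int_{-2}^{0} 0\cdot\cos\frac{n\pi x}{2}\,dx + \int_{0}^{2} 1\cdot\cos\frac{n\pi x}{2}\,dx\right) = \frac{1}{n\pi}\sin\frac{n\pi x}{2}\bigg|_{0}^{2} = 0,$$
$$b_n = \frac{1}{2}\left(\int_{-2}^{0} 0\cdot\sin\frac{n\pi x}{2}\,dx + \int_{0}^{2} 1\cdot\sin\frac{n\pi x}{2}\,dx\right) = \frac{1}{n\pi}(1 - \cos n\pi).$$
Therefore,
$$f(x) = \frac{1}{2} + \frac{2}{\pi}\left(\sin\frac{\pi x}{2} + \frac{1}{3}\sin\frac{3\pi x}{2} + \frac{1}{5}\sin\frac{5\pi x}{2} + \cdots\right).$$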
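The coefficient formulas for an interval of half-length $l$ can be checked against the step-function illustration above. The Python sketch below approximates the integrals by a midpoint rule; the names `fourier_coefficients` and `unit_step`, and the choice of quadrature, are assumptions made for illustration and are not part of the text.

```python
import math

def fourier_coefficients(func, half_length, n, steps=4000):
    """Return (a_n, b_n) for the interval (-half_length, half_length)."""
    width = 2.0 * half_length / steps
    a_n = b_n = 0.0
    for i in range(steps):
        x = -half_length + (i + 0.5) * width      # midpoint of the i-th strip
        angle = n * math.pi * x / half_length
        a_n += func(x) * math.cos(angle) * width
        b_n += func(x) * math.sin(angle) * width
    return a_n / half_length, b_n / half_length

def unit_step(x):
    """f(x) = 0 for x < 0 and f(x) = 1 for x > 0."""
    return 1.0 if x > 0 else 0.0

if __name__ == "__main__":
    for n in range(0, 6):
        a_n, b_n = fourier_coefficients(unit_step, 2.0, n)
        exact_b = 0.0 if n == 0 else (1 - math.cos(n * math.pi)) / (n * math.pi)
        print(n, round(a_n, 6), round(b_n, 6), round(exact_b, 6))
```

The printed values of $b_n$ should agree with $(1 - \cos n\pi)/(n\pi)$, and $a_n$ should vanish for $n \ge 1$ while $a_0 = 1$.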
If $n$ is any integer, then
$$\cos\frac{n\pi(x + 2l)}{l} = \cos\left(\frac{n\pi x}{l} + 2n\pi\right) = \cos\frac{n\pi x}{l},$$
and similarly for sines. Hence, each term of the series (20-2) has period
$2l$, and therefore the sum also has period $2l$. For this reason the sum cannot
represent an arbitrary function on $(-\infty,\infty)$; it represents periodic functions
only.
Subject to the Dirichlet conditions, however, the function may be chosen
arbitrarily on the interval $(-l,l)$, and it is natural to inquire if a representation
for arbitrary functions on $(-\infty,\infty)$ might be obtained by letting $l \to \infty$.
We shall see that such a representation is possible. The process leads to
the so-called Fourier integral theorem, which has many practical applications.¹
Assume that $f(x)$ satisfies the Dirichlet conditions in every interval $(-l,l)$
(no matter how large) and that the integral
$$M = \int_{-\infty}^{\infty} |f(x)|\,dx$$
converges. As we have just seen, $f(x)$ is given by (20-2), where²
$$a_n = \frac{1}{l}\int_{-l}^{l} f(t)\cos\frac{n\pi t}{l}\,dt, \qquad b_n = \frac{1}{l}\int_{-l}^{l} f(t)\sin\frac{n\pi t}{l}\,dt.$$
Substituting these values of the coefficients into (20-2) gives
$$f(x) = \frac{1}{2l}\int_{-l}^{l} f(t)\,dt + \frac{1}{l}\sum_{n=1}^{\infty}\int_{-l}^{l} f(t)\cos\frac{n\pi(t - x)}{l}\,dt \tag{20-3}$$
when we recall that
$$\cos\frac{n\pi t}{l}\cos\frac{n\pi x}{l} + \sin\frac{n\pi t}{l}\sin\frac{n\pi x}{l} = \cos\frac{n\pi(t - x)}{l}.$$
Since $\int_{-\infty}^{\infty} |f(x)|\,dx$ is assumed to be convergent,
$$\left|\frac{1}{2l}\int_{-l}^{l} f(t)\,dt\right| \le \frac{1}{2l}\int_{-l}^{l} |f(t)|\,dt \le \frac{M}{2l},$$
which obviously tends to zero as $l$ is allowed to increase indefinitely. Also,
if the interval $(-l,l)$ is made large enough, the quantity $\pi/l$, which appears
in the integrands of the sum, can be made as small as desired. Therefore,
the sum in (20-3) can be written as
$$\frac{1}{\pi}\left[\Delta\alpha\int_{-l}^{l} f(t)\cos\Delta\alpha(t - x)\,dt + \Delta\alpha\int_{-l}^{l} f(t)\cos 2\Delta\alpha(t - x)\,dt + \cdots
+ \Delta\alpha\int_{-l}^{l} f(t)\cos n\Delta\alpha(t - x)\,dt + \cdots\right], \tag{20-4}$$
where $\Delta\alpha = \pi/l$.
¹ Some of these applications are presented in Chap. 6.
² We use $t$ as variable of integration to avoid confusion with the $x$ in (20-2). If $f(x)$
is discontinuous at $x = x_0$, the left side of (20-2) means $\tfrac{1}{2}[f(x_0+) + f(x_0-)]$.
This sum suggests the definition of the definite integral of the function
$$F(\alpha) = \int_{-l}^{l} f(t)\cos\alpha(t - x)\,dt$$
in which the values of the function $F(\alpha)$ are calculated at the points
$$\frac{\pi}{l},\ \frac{2\pi}{l},\ \frac{3\pi}{l},\ \ldots.$$
Now, for large values of $l$
$$\int_{-l}^{l} f(t)\cos\alpha(t - x)\,dt$$
differs little from
$$\int_{-\infty}^{\infty} f(t)\cos\alpha(t - x)\,dt$$
and it appears plausible that as $l$ increases indefinitely, the sum (20-4) will
approach the limit
$$\frac{1}{\pi}\int_{0}^{\infty} d\alpha\int_{-\infty}^{\infty} f(t)\cos\alpha(t - x)\,dt.$$
If such is the case, then (20-3) can be written as
$$f(x) = \frac{1}{\pi}\int_{0}^{\infty} d\alpha\int_{-\infty}^{\infty} f(t)\cos\alpha(t - x)\,dt. \tag{20-5}$$
The foregoing discussion is heuristic and cannot be regarded as a rigorous
proof. However, the validity of formula (20-5) can be established rigorously¹
if the function $f(x)$ satisfies the conditions enunciated above. The
integral (20-5) bears the name of the Fourier integral.
Formula (20-5) assumes a simpler form if $f(x)$ is an even or an odd function.
Expanding the integrand of (20-5) gives
$$\frac{1}{\pi}\int_{0}^{\infty} d\alpha\left[\int_{-\infty}^{\infty} f(t)\cos\alpha t\cos\alpha x\,dt + \int_{-\infty}^{\infty} f(t)\sin\alpha t\sin\alpha x\,dt\right]$$
for the right-hand member. If $f(t)$ is odd, then $f(t)\cos\alpha t$ is an odd function
times an even function, hence is odd. Similarly, $f(t)\sin\alpha t$ is even
when $f(t)$ is odd. Upon applying (19-4) to the first integral in the foregoing
expression and (19-3) to the second integral, we see that
$$f(x) = \frac{2}{\pi}\int_{0}^{\infty} d\alpha\int_{0}^{\infty} f(t)\sin\alpha t\sin\alpha x\,dt \tag{20-6}$$
when $f(x)$ is odd. A similar argument shows that if $f(x)$ is even, then
$$f(x) = \frac{2}{\pi}\int_{0}^{\infty} d\alpha\int_{0}^{\infty} f(t)\cos\alpha t\cos\alpha x\,dt. \tag{20-7}$$
¹ See H. S. Carslaw, “Fourier's Series and Integrals,” pp. 283-294, The Macmillan
Company, New York, 1921, or E. C. Titchmarsh, “The Theory of Functions,” p. 433,
Oxford University Press, London, 1950.
If $f(x)$ is defined only in the interval $(0,\infty)$, then both (20-6) and (20-7)
may be used, since $f(x)$ may be thought to be defined in $(-\infty,0)$ so as to
make it either odd or even. This corresponds to the fact that a function
given on $(0,\pi)$ may be expanded in either a sine series or a cosine series.
Since the Fourier series converges to $\tfrac{1}{2}[f(x+) + f(x-)]$ at points of
discontinuity, the Fourier integral does also. In particular for an odd function¹
the integral converges to zero at $x = 0$, and this fact is verified by
setting $x = 0$ in (20-6).
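Example. By (20-7) obtain the formula
$$\int_{0}^{\infty} \frac{\sin\alpha\cos\alpha x}{\alpha}\,d\alpha =
\begin{cases} \pi/2 & \text{if } 0 \le x < 1, \\ \pi/4 & \text{if } x = 1, \\ 0 & \text{if } x > 1. \end{cases}$$
We choose $f(x) = 1$ for $0 \le x < 1$ and $f(x) = 0$ for $x > 1$. Then
$$\int_{0}^{\infty} f(t)\cos\alpha t\,dt = \int_{0}^{1} \cos\alpha t\,dt = \frac{\sin\alpha}{\alpha}, \qquad \alpha \ne 0.$$
Substitution into (20-7) gives
$$\int_{0}^{\infty} \frac{\sin\alpha}{\alpha}\cos\alpha x\,d\alpha = \frac{\pi}{2}f(x)$$
after multiplying by $\pi/2$. Upon recalling the definition of $f(x)$, we see that the desired
result is obtained for $0 \le x < 1$ and for $x > 1$. The fact that the integral is $\pi/4$ when
$x = 1$ follows from
$$\frac{\pi}{2}\cdot\frac{f(1-) + f(1+)}{2} = \frac{\pi}{2}\cdot\frac{1 + 0}{2} = \frac{\pi}{4}.$$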
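The three values of the integral in the example above can be checked roughly by truncating the infinite range. The Python sketch below is illustrative only: the cutoff `upper`, the midpoint quadrature, and the function name are assumptions, and the truncation leaves an error of order `1/upper`, so only modest accuracy should be expected.

```python
import math

def dirichlet_integral(x, upper=200.0, steps=200000):
    """Midpoint approximation to the integral of sin(a)*cos(a*x)/a over (0, upper)."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        alpha = (i + 0.5) * width          # midpoints never hit alpha = 0
        total += math.sin(alpha) * math.cos(alpha * x) / alpha
    return total * width

if __name__ == "__main__":
    print(dirichlet_integral(0.5), math.pi / 2)   # inside 0 <= x < 1
    print(dirichlet_integral(1.0), math.pi / 4)   # at the jump x = 1
    print(dirichlet_integral(2.0), 0.0)           # beyond x = 1
```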
PROBLEMS
1. If $f(x)$ is an odd function on $(-l,l)$, show that the Fourier series takes the form
$$f(x) = \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{l}, \qquad b_n = \frac{2}{l}\int_{0}^{l} f(x)\sin\frac{n\pi x}{l}\,dx.$$
Similarly, if $f(x)$ is even, then
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos\frac{n\pi x}{l}, \qquad a_n = \frac{2}{l}\int_{0}^{l} f(x)\cos\frac{n\pi x}{l}\,dx.$$
2. Expand the function defined by $f(x) = 1$ on $(0,2)$ and $f(x) = -1$ on $(-2,0)$.
3. Expand $f(x) = |x|$ in the interval $(-1,1)$.
4. Expand $f(x) = \cos\pi x$ in the interval $(-1,1)$.
5. Find the expansion in the series of cosines, if
$$f(x) = 1 \quad\text{when } 0 < x < \pi,$$
$$f(x) = 0 \quad\text{when } \pi < x < 2\pi.$$
Hint: Regard $f(x)$ as being an even function.
¹ It should be noted that every odd function, if defined at $x = 0$, satisfies $f(0) = 0$
[although for an even function $f(0)$ is arbitrary]. Hence a function defined for $x \ge 0$
must sometimes be redefined at $x = 0$ before it can be made into an odd function.
0. Expand
f(x) ■» — x. if 0 < x <
f(z) *■ x — if }i < x < 1 .
7. Show that the series
$$\frac{l}{\pi}\sum_{n=1}^{\infty} \frac{1}{n}\sin\frac{2n\pi x}{l}$$
represents $\tfrac{1}{2}l - x$ when $0 < x < l$.
8. Show, with the aid of (20-6) and (20-7), that
$$\int_{0}^{\infty} \frac{\alpha\sin\alpha x}{\alpha^2 + \beta^2}\,d\alpha = \frac{\pi}{2}e^{-\beta x}, \qquad \text{if } \beta > 0,$$
$$\int_{0}^{\infty} \frac{\cos\alpha x}{\alpha^2 + \beta^2}\,d\alpha = \frac{\pi}{2\beta}e^{-\beta x}.$$
Hint: Take $f(x) = e^{-\beta x}$.
9. An integral equation is an equation in which an unknown function appears under
an integral sign. If $F(t)$ is known and $f(x)$ is to be found, the integral equation of Fourier is
$$\int_{0}^{\infty} f(x)\cos xt\,dx = F(t).$$
(a) Using (20-7) show that a solution is given by
$$f(x) = \frac{2}{\pi}\int_{0}^{\infty} F(t)\cos xt\,dt.$$
(b) State a similar integral equation which can be solved by use of (20-6), and
solve it.
21. Complex Form of Fourier Series. The Fourier series
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n\cos nx + b_n\sin nx) \tag{21-1}$$
with
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt$$
can be written, with the aid of the Euler formula¹
$$e^{iu} = \cos u + i\sin u \tag{21-2}$$
in an equivalent form, namely,
$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx}, \tag{21-3}$$
where the coefficients $c_n$ are defined by the equation
$$c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-int}\,dt. \tag{21-4}$$
¹ See Sec. 17.
The index of summation $n$ in (21-3) runs through the set of all positive and
negative integral values including zero.
The equivalence of (21-3) and (21-1) can be established in the following
manner: Substituting from (21-2) in (21-4) gives, for $n > 0$,
$$c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)(\cos nt - i\sin nt)\,dt
= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt - \frac{i}{2\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt
= \frac{a_n}{2} - i\frac{b_n}{2}.$$
A similar calculation gives
$$c_{-n} = \frac{a_n}{2} + i\frac{b_n}{2},$$
while $c_0 = a_0/2$.
Now (21-3) can be written in the form
$$f(x) = c_0 + \sum_{n=1}^{\infty} c_n e^{inx} + \sum_{n=1}^{\infty} c_{-n}e^{-inx}.$$
Making use of the expressions for the $c_n$ just found gives
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \frac{a_n - ib_n}{2}e^{inx} + \sum_{n=1}^{\infty} \frac{a_n + ib_n}{2}e^{-inx}
= \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\frac{e^{inx} + e^{-inx}}{2} - i\sum_{n=1}^{\infty} b_n\frac{e^{inx} - e^{-inx}}{2}.$$
By (21-2),
$$e^{iu} + e^{-iu} = 2\cos u \qquad\text{and}\qquad e^{iu} - e^{-iu} = 2i\sin u,$$
and hence the latter series is identical with (21-1).
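The relations $c_n = \tfrac{1}{2}(a_n - ib_n)$ and $c_{-n} = \tfrac{1}{2}(a_n + ib_n)$ can be confirmed numerically for a sample function. In the Python sketch below the integrals are approximated by a midpoint rule; the helper names and the test function are assumptions chosen for illustration.

```python
import cmath
import math

def midpoint(func, lower, upper, steps=4000):
    """Midpoint-rule approximation; works for real or complex integrands."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

def real_coefficients(func, n):
    """a_n and b_n of the real Fourier series on (-pi, pi)."""
    a_n = midpoint(lambda t: func(t) * math.cos(n * t), -math.pi, math.pi) / math.pi
    b_n = midpoint(lambda t: func(t) * math.sin(n * t), -math.pi, math.pi) / math.pi
    return a_n, b_n

def complex_coefficient(func, n):
    """c_n = (1/2pi) * integral of f(t) exp(-i n t) over (-pi, pi)."""
    integral = midpoint(lambda t: func(t) * cmath.exp(-1j * n * t), -math.pi, math.pi)
    return integral / (2 * math.pi)

def sample(t):
    # An arbitrary function with both even and odd parts.
    return t + t * t

if __name__ == "__main__":
    for n in (1, 2, 3):
        a_n, b_n = real_coefficients(sample, n)
        print(n, complex_coefficient(sample, n), complex(a_n / 2, -b_n / 2))
```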
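To illustrate the use of (21-4), consider the function $f(x) = e^{ax}$ on $(-\pi,\pi)$. Here,
$$2\pi c_n = \int_{-\pi}^{\pi} e^{at}e^{-int}\,dt = \int_{-\pi}^{\pi} e^{(a - in)t}\,dt = \frac{e^{(a - in)t}}{a - in}\bigg|_{-\pi}^{\pi}.$$
Since (21-2) gives $e^{\pm in\pi} = \cos(\pm n\pi) = (-1)^n$, we obtain
$$c_n = \frac{(-1)^n\sinh\pi a}{\pi(a - in)} = \frac{(-1)^n\sinh\pi a}{\pi(a^2 + n^2)}(a + in),$$
and hence by (21-3),
$$e^{ax} = \frac{\sinh\pi a}{\pi}\sum_{n=-\infty}^{\infty} \frac{(-1)^n(a + in)}{a^2 + n^2}e^{inx}. \tag{21-5}$$
The methods of the last section yield
$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{in\pi x/l}, \qquad\text{with}\quad c_n = \frac{1}{2l}\int_{-l}^{l} f(t)e^{-in\pi t/l}\,dt, \tag{21-6}$$
for the expansion on an arbitrary interval $(-l,l)$. Upon letting $l \to \infty$, we
obtain the Fourier integral theorem in the form
$$f(x) = \lim_{A\to\infty} \frac{1}{2\pi}\int_{-A}^{A} d\alpha\int_{-\infty}^{\infty} f(t)e^{i\alpha(x - t)}\,dt \tag{21-7}$$
when $f(x)$ satisfies the conditions postulated for (20-5).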
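The closed form found for the coefficients of $e^{ax}$ can be compared with a direct numerical evaluation of (21-4). The Python sketch below is illustrative; the function names, the value $a = 0.5$, and the midpoint quadrature are assumptions.

```python
import cmath
import math

def exp_coefficient_formula(a, n):
    """c_n = (-1)^n sinh(pi a) (a + i n) / (pi (a^2 + n^2)), as derived in the text."""
    sign = -1.0 if abs(n) % 2 else 1.0
    return sign * math.sinh(math.pi * a) * (a + 1j * n) / (math.pi * (a * a + n * n))

def exp_coefficient_numeric(a, n, steps=20000):
    """Midpoint approximation to (1/2pi) * integral of exp((a - i n) t) over (-pi, pi)."""
    width = 2.0 * math.pi / steps
    total = 0j
    for i in range(steps):
        t = -math.pi + (i + 0.5) * width
        total += cmath.exp((a - 1j * n) * t)
    return total * width / (2.0 * math.pi)

if __name__ == "__main__":
    for n in range(-3, 4):
        print(n, exp_coefficient_formula(0.5, n), exp_coefficient_numeric(0.5, n))
```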
If
$$g(u) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-iux}f(x)\,dx, \tag{21-8}$$
then (21-7) gives, after renaming some of the variables,
$$f(x) = \lim_{A\to\infty} \frac{1}{\sqrt{2\pi}}\int_{-A}^{A} e^{iux}g(u)\,du. \tag{21-9}$$
The transform $T$ defined by
$$T(f) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-iux}f(x)\,dx$$
is called the Fourier transform; it is one of the most powerful tools in the
whole repertoire of modern analysis. Although $T$ is related to the Laplace
transform $L$ introduced in Appendix B, $T$ is much easier to invert; that is,
one can readily find $f(x)$ by (21-9) when $T(f)$ is known.
PROBLEMS
1. Derive (21-6) from (21-3) and (21-4).
2. (a) Show that
$$a_0 = 2c_0, \qquad a_n = c_n + c_{-n}, \qquad b_n = i(c_n - c_{-n})$$
[and hence the real form (21-1) can be deduced from the complex form (21-3)].
(b) By applying your result (a) to (21-5) obtain the real Fourier series (21-1) for $e^{ax}$.
3. By setting $x = 0$ in (21-5) obtain the expansion
$$\frac{\pi}{a\sinh\pi a} = \sum_{n=-\infty}^{\infty} \frac{(-1)^n}{a^2 + n^2}.$$
ADDITIONAL TOPICS IN FOURIER SERIES
22. Orthogonal Functions. A sequence of functions $\theta_n(x)$ is said to be
orthogonal on the interval $(a,b)$ if
$$\int_{a}^{b} \theta_m(x)\theta_n(x)\,dx \;\begin{cases} = 0 & \text{for } m \ne n, \\ \ne 0 & \text{for } m = n. \end{cases} \tag{22-1}$$
For example, the sequence
$$\theta_1(x) = \sin x, \quad \theta_2(x) = \sin 2x, \quad \ldots, \quad \theta_n(x) = \sin nx, \quad \ldots$$
is orthogonal on $(0,\pi)$ because
$$\int_{0}^{\pi} \theta_m(x)\theta_n(x)\,dx = \int_{0}^{\pi} \sin mx\sin nx\,dx =
\begin{cases} 0 & \text{for } m \ne n, \\ \pi/2 & \text{for } m = n. \end{cases}$$
The sequence
$$1,\ \sin x,\ \cos x,\ \sin 2x,\ \cos 2x,\ \ldots \tag{22-2}$$
is orthogonal on $(0,2\pi)$, though not on $(0,\pi)$.
In the foregoing sections the functions (22-2) were used to form Fourier
series. Actually, one may form series analogous to Fourier series by means
of any orthogonal set. These generalized Fourier series are an indispensable
aid in electromagnetic theory, acoustics, heat flow, and many other
branches of mathematical physics.¹
The formula for Fourier coefficients is especially simple if the integral
(22-1) has the value 1 for $m = n$. The functions $\theta_n(x)$ are then said to be
normalized, and $\{\theta_n(x)\}$ is called an orthonormal set. If
$$\int_{a}^{b} [\theta_n(x)]^2\,dx = A_n$$
in (22-1), it is easily seen that the functions
$$\phi_n(x) = (A_n)^{-1/2}\theta_n(x)$$
are orthonormal; in other words,
$$\int_{a}^{b} \phi_m(x)\phi_n(x)\,dx \;\begin{cases} = 0 & \text{for } m \ne n, \\ = 1 & \text{for } m = n. \end{cases}$$
For example, since
$$\int_{0}^{2\pi} 1\,dx = 2\pi, \qquad \int_{0}^{2\pi} \sin^2 nx\,dx = \pi, \qquad \int_{0}^{2\pi} \cos^2 nx\,dx = \pi$$
for $n \ge 1$, the orthonormal set corresponding to the orthogonal set (22-2) is
$$(2\pi)^{-1/2},\ \pi^{-1/2}\sin x,\ \pi^{-1/2}\cos x,\ \ldots,\ \pi^{-1/2}\sin nx,\ \pi^{-1/2}\cos nx,\ \ldots.$$
The product of two different functions in this set gives zero but the square
of each function gives 1, when integrated from zero to $2\pi$.
¹ See Chap. 6.
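The orthonormality of the trigonometric set just described can be verified numerically by forming the matrix of all pairwise integrals, which should be the identity on $(0,2\pi)$. The Python sketch below is illustrative; the helper names and the midpoint quadrature are assumptions.

```python
import math

def midpoint(func, lower, upper, steps=2000):
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

def orthonormal_trig_set(max_n):
    """(2 pi)^(-1/2), pi^(-1/2) sin x, pi^(-1/2) cos x, ..., up to frequency max_n."""
    functions = [lambda x: 1.0 / math.sqrt(2.0 * math.pi)]
    for n in range(1, max_n + 1):
        functions.append(lambda x, n=n: math.sin(n * x) / math.sqrt(math.pi))
        functions.append(lambda x, n=n: math.cos(n * x) / math.sqrt(math.pi))
    return functions

def gram_matrix(functions, lower, upper):
    """Matrix of integrals of f_i * f_j over (lower, upper)."""
    return [[midpoint(lambda x: f(x) * g(x), lower, upper) for g in functions]
            for f in functions]

if __name__ == "__main__":
    basis = orthonormal_trig_set(2)
    for row in gram_matrix(basis, 0.0, 2.0 * math.pi):
        print([round(entry, 6) for entry in row])
```

Repeating the computation on $(0,\pi)$ gives nonzero off-diagonal entries, for instance between $\sin x$ and $\cos 2x$, in agreement with the remark that (22-2) is not orthogonal on $(0,\pi)$.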
Let $\{\phi_n(x)\}$ be an orthonormal set of functions on $(a,b)$, and suppose that
another function $f(x)$ is to be expanded in the form
$$f(x) = c_1\phi_1(x) + c_2\phi_2(x) + \cdots + c_n\phi_n(x) + \cdots. \tag{22-4}$$
To determine the coefficients $c_n$ we multiply by $\phi_n(x)$, getting
$$f(x)\phi_n(x) = c_1\phi_1(x)\phi_n(x) + \cdots + c_n[\phi_n(x)]^2 + \cdots.$$
Here, the terms not written involve products $\phi_n(x)\phi_m(x)$ with $m \ne n$. If
we integrate from $a$ to $b$, these terms disappear, and hence
$$\int_{a}^{b} f(x)\phi_n(x)\,dx = \int_{a}^{b} c_n[\phi_n(x)]^2\,dx = c_n. \tag{22-5}$$
According to Theorem III, Sec. 7, the term-by-term integration is justified when the
series is uniformly convergent and the functions are continuous. The foregoing procedure
shows that if $f(x)$ has an expansion of the desired type, then the coefficients $c_n$
must be given by (22-5). In the following section (22-5) is obtained in a different manner,
which does not assume uniform convergence.
The formula (22-5) is called the Euler-Fourier formula, the coefficients
$c_n$ are called the Fourier coefficients of $f(x)$ with respect to $\{\phi_n(x)\}$, and the
resulting series (22-4) is called the Fourier series of $f(x)$ with respect to
$\{\phi_n(x)\}$. The reader can verify that the foregoing results applied to the
sequence (22-2) yield the ordinary Fourier series, as described in the foregoing
sections.
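As a concrete instance of the Euler-Fourier formula (22-5), the Python sketch below computes the coefficients of $f(x) = x$ with respect to the set $\phi_n(x) = (2/\pi)^{1/2}\sin nx$, which is orthonormal on $(0,\pi)$ because $\int_0^\pi \sin^2 nx\,dx = \pi/2$. The choice of $f$, the function names, and the midpoint quadrature are assumptions made for illustration.

```python
import math

def phi(n, x):
    """Orthonormal sine function on (0, pi)."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def generalized_coefficient(func, n, steps=4000):
    """c_n = integral of func(x) * phi_n(x) over (0, pi), by the midpoint rule."""
    width = math.pi / steps
    return width * sum(func((i + 0.5) * width) * phi(n, (i + 0.5) * width)
                       for i in range(steps))

def exact_coefficient_for_x(n):
    """For f(x) = x the integral evaluates to sqrt(2 pi) * (-1)^(n+1) / n."""
    return math.sqrt(2.0 * math.pi) * (-1) ** (n + 1) / n

if __name__ == "__main__":
    for n in range(1, 5):
        print(n, generalized_coefficient(lambda x: x, n), exact_coefficient_for_x(n))
```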
Orthogonal sets of functions are obtained in practice by solving differential
equations, and this possibility will be discussed next. On a given interval
$a \le x \le b$ consider the equation
$$\frac{d}{dx}\left[p(x)\frac{dy}{dx}\right] + q(x)y = \lambda r(x)y, \qquad \lambda = \text{const}, \tag{22-6}$$
or, in abbreviated form,
$$(py')' + qy = \lambda ry.$$
It will be convenient to require the additional condition
$$\int_{a}^{b} ry^2\,dx \ne 0$$
which, in particular, rules out the trivial solution $y \equiv 0$.
Let $y_m$ be a solution when $\lambda$ has the value $\lambda_m$, and let $y_n$ be a solution
when $\lambda$ has a different value, $\lambda_n$. Thus,
$$(py_m')' + qy_m = \lambda_m ry_m, \tag{22-7}$$
$$(py_n')' + qy_n = \lambda_n ry_n. \tag{22-8}$$
If (22-7) is multiplied by $y_n$ and (22-8) by $y_m$, we get
$$y_n(py_m')' - y_m(py_n')' = \lambda_m ry_m y_n - \lambda_n ry_m y_n \tag{22-9}$$
after subtracting the resulting expressions. Since
$$[y_n(py_m') - y_m(py_n')]' = y_n(py_m')' + y_n'(py_m') - y_m(py_n')' - y_m'(py_n')
= \text{left side of (22-9)},$$
the foregoing result (22-9) may be written
$$\frac{d}{dx}[p(y_n y_m' - y_m y_n')] = (\lambda_m - \lambda_n)ry_m y_n.$$
Integrating from $a$ to $b$ yields the fundamental formula
$$p(y_n y_m' - y_m y_n')\Big|_{a}^{b} = (\lambda_m - \lambda_n)\int_{a}^{b} ry_m y_n\,dx \tag{22-10}$$
when $r$ is continuous.
If the conditions at $a$ and $b$ are such that the left side of (22-10) is zero,
we can deduce
$$\int_{a}^{b} ry_m y_n\,dx = 0, \qquad m \ne n, \tag{22-11}$$
since $\lambda_m \ne \lambda_n$. The relation (22-11) may be written
$$\int_{a}^{b} (\sqrt{r}\,y_m)(\sqrt{r}\,y_n)\,dx = 0, \qquad m \ne n,$$
and hence the sequence $\theta_n(x)$ defined by
$$\theta_n(x) = \sqrt{r(x)}\,y_n(x) \tag{22-12}$$
satisfies the orthogonality criterion (22-1). An orthonormal set $\{\phi_n\}$ may
be obtained from $\{\theta_n\}$ as described previously.
When $r(x)$ is negative, the foregoing process does not yield a real sequence $\{\theta_n(x)\}$, and
it is better to work directly with (22-11). Functions $y_n$ satisfying (22-11) are said to be
orthogonal with respect to the weighting function $r(x)$; the definition (22-1) corresponds to
the case $r \equiv 1$. Fourier series based on the more general concept of orthogonality (22-11)
are quite analogous to those based on (22-1) (cf. Prob. 2).
Example 1. Show that the sequence (22-2) is orthogonal on the interval $(-\pi,\pi)$.
Since $\sin nx$ and $\cos nx$ satisfy $y'' = -n^2 y$, we may use the formula (22-10) with
$p = r = 1$. The result is
$$(y_n y_m' - y_m y_n')\Big|_{-\pi}^{\pi} = (-m^2 + n^2)\int_{-\pi}^{\pi} y_m y_n\,dx, \tag{22-13}$$
where $y_n = \sin nx$ or $\cos nx$ and $y_m = \sin mx$ or $\cos mx$. Since $y_n y_m' - y_m y_n'$ has period
$2\pi$, the value at $\pi$ is the same as the value at $-\pi$, and hence the left side of (22-13) is
zero. This yields the desired orthogonality except in the case $m = n$. If $m = n$, however,
the relevant integral may be evaluated by inspection:
$$\int_{-\pi}^{\pi} \cos nx\sin nx\,dx = \frac{1}{2}\int_{-\pi}^{\pi} \sin 2nx\,dx = 0.$$
Example 2. Show that the Legendre functions $P_n(x)$ are orthogonal on the interval
$(-1,1)$.
Legendre's equation (13-8) may be written
$$[(1 - x^2)y']' = \lambda y,$$
where $\lambda$ is constant; $\lambda = -n(n + 1)$ when $y = P_n(x)$. The special case $p = (1 - x^2)$,
$q = 0$, $r = 1$ in (22-6) and (22-10) yields
$$(1 - x^2)(P_n P_m' - P_m P_n')\Big|_{-1}^{1} = [-m(m + 1) + n(n + 1)]\int_{-1}^{1} P_m(x)P_n(x)\,dx.$$
Since $(1 - x^2)$ vanishes at $\pm 1$, the left side is zero and the orthogonality follows. It can
be shown¹ also that
$$\int_{-1}^{1} [P_n(x)]^2\,dx = \frac{2}{2n + 1},$$
and hence the corresponding orthonormal set is
$$\phi_n(x) = (n + \tfrac{1}{2})^{1/2}P_n(x).$$
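Example 3. Let the sequence $\rho_1, \rho_2, \ldots$ be the distinct positive roots of the equation
$J_\mu(x) = 0$, so that $J_\mu(\rho_n) = 0$. If $\mu \ge 0$ the functions
$$\phi_n(x) = \sqrt{2x}\,\frac{J_\mu(\rho_n x)}{J_\mu'(\rho_n)} \tag{22-14}$$
are orthonormal on the interval $(0,1)$.
By (14-1) it is found that $y = J_\mu(\rho x)$ satisfies
$$(xy')' - \frac{\mu^2}{x}y = \lambda xy, \qquad \lambda = -\rho^2,$$
and hence (22-10) holds with $p = r = x$. If we choose $y_n = J_\mu(\rho x)$ and $y_m = J_\mu(\rho_m x)$,
the left side of (22-10) is
$$x\left[J_\mu(\rho x)\frac{d}{dx}J_\mu(\rho_m x) - J_\mu(\rho_m x)\frac{d}{dx}J_\mu(\rho x)\right]\bigg|_{0}^{1},$$
which reduces to $\rho_m J_\mu(\rho)J_\mu'(\rho_m)$, since $J_\mu(\rho_m) = 0$. It follows that
$$(-\rho_m^2 + \rho^2)\int_{0}^{1} xJ_\mu(\rho_m x)J_\mu(\rho x)\,dx = \rho_m J_\mu(\rho)J_\mu'(\rho_m). \tag{22-15}$$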
¹ See E. T. Whittaker and G. N. Watson, “Modern Analysis,” p. 305, Cambridge
University Press, London, 1952; T. M. MacRobert, “Spherical Harmonics,” p. 92, Dover
Publications, New York, 1948; W. E. Byerly, “Fourier Series and Spherical Harmonics,”
p. 170, Ginn & Company, Boston, 1893.
Since $J_\mu(\rho_n) = 0$, the choice $\rho = \rho_n$ in (22-15) yields
$$\int_{0}^{1} xJ_\mu(\rho_m x)J_\mu(\rho_n x)\,dx = 0, \qquad m \ne n. \tag{22-16}$$
Moreover, differentiating (22-15) with respect to $\rho$ we get
$$2\rho\int_{0}^{1} xJ_\mu(\rho_m x)J_\mu(\rho x)\,dx + (\rho^2 - \rho_m^2)\int_{0}^{1} x^2 J_\mu(\rho_m x)J_\mu'(\rho x)\,dx = \rho_m J_\mu'(\rho)J_\mu'(\rho_m),$$
which reduces to
$$2\int_{0}^{1} x[J_\mu(\rho_m x)]^2\,dx = [J_\mu'(\rho_m)]^2 \tag{22-17}$$
when $\rho = \rho_m$. Equations (22-16) and (22-17) show that the sequence (22-14) is
orthonormal on $(0,1)$, as desired.
The fact that the equation $J_\mu(x) = 0$ has infinitely many roots $\rho_n$ is established in
treatises on Bessel functions; analysis of such questions for general differential equations
constitutes the so-called Sturm-Liouville theory. It can be shown that Fourier series of
Bessel or Legendre functions actually converge; that is, an analogue of Dirichlet's theorem
holds in such cases.¹ These questions are treated, from a very general point of
view, in a branch of analysis known as spectral theory.
PROBLEMS
1. By considering the equation $y'' = \lambda y$ show that the sequence $\sin(n\pi x/l)$ is orthogonal
on the interval $(0,l)$, and construct the corresponding orthonormal set.
2. Suppose an arbitrary function $f(x)$ is expanded in a uniformly convergent series
$f(x) = \sum c_n y_n(x)$, where $y_n$ are the functions in (22-11). Show that
$$c_n = \int_{a}^{b} r(x)f(x)y_n(x)\,dx \bigg/ \int_{a}^{b} r(x)[y_n(x)]^2\,dx.$$
Hint: Multiply the given series by $r(x)y_n(x)$, and integrate term by term.
3 . If m -J- n is positive, show that.
(m 2 - rr) f x~\r tn {x)Ju(x) c lx « D ~ Sm(l)Jn(l)l
Jo
Hint * Bessel's equation (1 1-1) may be written
X
{xy ) + xy = - y,
x
where X « n 2 when y — JJx) To avoid difficulty at x * 0 one may consider j and
let e —» 0. The convergence follows from (1 1-0), since (11-9) gives
Jm{x)J n (x) ^ (const) x m+n , an x — » 0
4 . It can be shown that as |/| — ♦
[J
/ 7 r 7ir\
J&) ~
1 2 . (
it m r\
J— COS 1
V ra*
V ~ 4 ~ TJ '
j — *
Cr m v
1
»■* i
1
¹ E. A. Coddington and N. Levinson, “Theory of Ordinary Differential Equations,”
chap. 7, McGraw-Hill Book Company, Inc., New York, 1955.
By letting / « in Prob. 8, deduce
r 2 r
x~ ] l J m (x)J n (x) dx * - sin (rn — ») ~ > to 4- n > 0.
7T 2
6 . If <f>n(x) are orthonormal on the interval (0,1), show that
^r«W * 0“ ! Vn(x/0)
are orthonormal on the interval (0,n)
23. The Mean Convergence of Fourier Series. If we try to approximate
a function $f(x)$ by another function $p_n(x)$, the quantity
$$|f(x) - p_n(x)| \qquad\text{or}\qquad [f(x) - p_n(x)]^2 \tag{23-1}$$
gives a measure of the error in the approximation. The sequence $p_n(x)$
converges to $f(x)$ whenever the expressions (23-1) approach zero as $n \to \infty$.
These measures of the error are appropriate for discussing convergence
at any fixed point $x$. But it is often useful to have a measure of error which
applies simultaneously to a whole interval of $x$ values, $a \le x \le b$. Such
a measure is easily found if we integrate (23-1) from $a$ to $b$:
$$\int_{a}^{b} |f(x) - p_n(x)|\,dx \qquad\text{or}\qquad \int_{a}^{b} [f(x) - p_n(x)]^2\,dx. \tag{23-2}$$
These expressions are called the mean¹ error and mean-square error, respectively.
If either expression (23-2) approaches zero as $n \to \infty$, we say that
the sequence $p_n(x)$ converges in mean to $f(x)$ and we speak of mean
convergence.
Even though (23-2) involves an integration which is not present in (23-1),
for Fourier series it is much easier to discuss the mean-square error and the
corresponding mean convergence than the ordinary convergence. Such a
discussion is presented now.
Let $\phi_n(x)$ be a set of functions normal and orthogonal on a < x < b, so
that, as in the last section,
$$\int_a^b \phi_n(x)\phi_m(x)\,dx = \begin{cases} 0 & \text{for } m \ne n, \\ 1 & \text{for } m = n. \end{cases}$$ (23-3)
We seek to approximate f(x) by a linear combination of $\phi_n(x)$,
$$p_n(x) = a_1\phi_1(x) + a_2\phi_2(x) + \cdots + a_n\phi_n(x),$$
in such a way that the mean-square error (23-2) is minimum:²
¹ Note that if the expressions (23-2) are multiplied by 1/(b − a), we get precisely the
mean values of the corresponding expressions (23-1).
² We use f and $\phi_n$ as abbreviations for f(x) and $\phi_n(x)$, respectively. It is assumed that
f and $\phi_n$ are integrable on a < x < b. If the integrals are improper, the convergence of
$\int_a^b f^2\,dx$ and $\int_a^b \phi_n^2\,dx$ is required.
$$E \equiv \int_a^b [f - (a_1\phi_1 + \cdots + a_n\phi_n)]^2\,dx = \min.$$ (23-4)
Upon expanding the bracket we see that (23-4) yields
$$\int_a^b f^2\,dx - 2\int_a^b f(a_1\phi_1 + \cdots + a_n\phi_n)\,dx + \int_a^b (a_1\phi_1 + \cdots + a_n\phi_n)^2\,dx.$$ (23-5)
If the Fourier coefficients of f relative to $\phi_k$ are denoted by
$$c_k = \int_a^b f\phi_k\,dx,$$
then the second integral (23-5) is
$$\int_a^b f(a_1\phi_1 + \cdots + a_n\phi_n)\,dx = a_1c_1 + a_2c_2 + \cdots + a_nc_n.$$
The third integral (23-5) may be written
$$\int_a^b (a_1\phi_1 + \cdots + a_n\phi_n)(a_1\phi_1 + \cdots + a_n\phi_n)\,dx = \int_a^b (a_1^2\phi_1^2 + \cdots + a_n^2\phi_n^2 + \cdots)\,dx = a_1^2 + \cdots + a_n^2,$$
where the second group of terms “$+ \cdots$” involves cross products $\phi_i\phi_j$ with
$i \ne j$. By (23-3) these terms integrate to zero, and the expression reduces
to the value indicated.
Hence, (23-5) yields
$$E = \int_a^b f^2\,dx - 2\sum_{k=1}^n a_kc_k + \sum_{k=1}^n a_k^2$$ (23-6)
for the mean-square error in the approximation. Inasmuch as
$$-2a_kc_k + a_k^2 = -c_k^2 + (a_k - c_k)^2,$$
the error E in (23-6) is also equal to
$$E = \int_a^b f^2\,dx - \sum_{k=1}^n c_k^2 + \sum_{k=1}^n (a_k - c_k)^2,$$ (23-7)
and we have established a theorem of central importance:
Theorem I. If $\phi_n$ is a set of normal and orthogonal functions, the mean-square
error (23-4) may be written in the form (23-7), where $c_k$ are the Fourier
coefficients of f relative to $\phi_k$.
By going back and forth between the two expressions (23-4) and (23-7),
one obtains a number of interesting and significant theorems with the
greatest ease. In the first place, the terms $(a_k - c_k)^2$ in (23-7) are positive
unless $a_k = c_k$, in which case they are zero. Hence the choice of $a_k$ that
makes E minimum is obviously $a_k = c_k$, and we have the following:
Corollary 1. The partial sums of the Fourier series
$$c_1\phi_1 + \cdots + c_n\phi_n, \qquad c_k = \int_a^b f\phi_k\,dx,$$
give a smaller mean-square error $\int_a^b (f - p_n)^2\,dx$ than is given by any other
linear combination
$$p_n = a_1\phi_1 + \cdots + a_n\phi_n.$$
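As a numerical sketch of Corollary 1 and of the identity (23-7), the following Python fragment may be helpful. It assumes the orthonormal functions $\sqrt{2/\pi}\,\sin kx$ on $(0, \pi)$ and the sample function f(x) = x; these choices, the midpoint-rule helper `integrate`, and the perturbation values are illustrative assumptions and are not taken from the text.

```python
import math

def integrate(g, a, b, steps=2000):
    """Composite midpoint rule for the integral of g over (a, b)."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

A, B = 0.0, math.pi  # interval (a, b), assumed for illustration

def phi(k, x):
    """Orthonormal sine functions on (0, pi): an illustrative choice."""
    return math.sqrt(2.0 / math.pi) * math.sin(k * x)

def f(x):
    return x  # sample function, chosen only for illustration

def mean_square_error(coeffs):
    """E of (23-4) for p_n = sum of coeffs[k-1] * phi_k."""
    def err(x):
        p = sum(a * phi(k, x) for k, a in enumerate(coeffs, start=1))
        return (f(x) - p) ** 2
    return integrate(err, A, B)

n = 4
# Fourier coefficients c_k = integral of f * phi_k
c = [integrate(lambda x, k=k: f(x) * phi(k, x), A, B) for k in range(1, n + 1)]
E_min = mean_square_error(c)

# Any other coefficients a_k give a larger error; (23-7) predicts that the
# increase equals the sum of (a_k - c_k)^2.
a = [ck + d for ck, d in zip(c, (0.1, -0.2, 0.05, 0.3))]
E_pert = mean_square_error(a)
excess = sum((ak - ck) ** 2 for ak, ck in zip(a, c))
print(E_min, E_pert, E_pert - E_min, excess)
```

Under these assumptions the difference `E_pert - E_min` should agree with `excess` up to rounding, which is what (23-7) asserts.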
Upon setting $a_k = c_k$ in (23-7), we see that the minimum value of the
error is
$$\min E = \int_a^b f^2\,dx - \sum_{k=1}^n c_k^2.$$ (23-8)
Now, the expression (23-4) shows that $E \ge 0$, because the integrand in
(23-4), being a square, is not negative. Since $E \ge 0$ for all choices of $a_k$,
it is clear that the minimum of E (which arises when $a_k = c_k$) is also $\ge 0$.
The expression (23-8) yields, then,
$$\int_a^b f^2\,dx - \sum_{k=1}^n c_k^2 \ge 0 \quad\text{or}\quad \sum_{k=1}^n c_k^2 \le \int_a^b f^2\,dx.$$
Upon letting $n \to \infty$ we obtain¹
Corollary 2. If $c_k = \int_a^b f\phi_k\,dx$ are the Fourier coefficients of f relative to the
orthonormal set $\phi_k$, then the series $\sum c_k^2$ converges and satisfies the so-called Bessel
inequality
$$\sum_{k=1}^\infty c_k^2 \le \int_a^b [f(x)]^2\,dx.$$ (23-9)
Because the general term of a convergent series must approach zero
(Sec. 1, Theorem I) we deduce the following from Corollary 2:
Corollary 3. The Fourier coefficients $c_n = \int_a^b f\phi_n\,dx$ tend to zero as $n \to \infty$.
For applications it is important to know whether or not the mean-square
error approaches zero as $n \to \infty$. Evidently the error approaches zero, for
some choice of the $a_k$'s, only if the minimum error (23-8) does so. Letting
$n \to \infty$ in (23-8) we get the so-called Parseval equality
$$\int_a^b f^2\,dx - \sum_{k=1}^\infty c_k^2 = 0$$
as the condition for zero error:
¹ Since $c_k^2 \ge 0$, the sequence $\sum_{k=1}^n c_k^2$ is nondecreasing. We have just seen that it is
bounded, and hence the limit exists by the fundamental principle (Sec. 1).
Corollary 4. If f is approximated by the partial sums of its Fourier series,
the mean-square error approaches zero as $n \to \infty$ if, and only if, Bessel's
inequality (23-9) becomes Parseval's equality
$$\sum_{n=1}^\infty c_n^2 = \int_a^b [f(x)]^2\,dx.$$ (23-10)
In other words, the Fourier series converges to f in the mean-square
sense if, and only if, (23-10) holds. If this happens for every choice of f,
the set $\phi_n(x)$ is said to be closed. A closed set, then, is a set that can be
used for mean-square approximation of arbitrary functions. It can be
shown that the trigonometrical functions cos nx and sin nx are closed on
$0 < x < 2\pi$, though the proof is too long for inclusion here.¹
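The passage from Bessel's inequality to Parseval's equality can be watched numerically. The sketch below assumes the orthonormal sine functions $\sqrt{2/\pi}\,\sin kx$ on $(0, \pi)$ and the function f(x) = x, for which integration by parts gives $c_k = \sqrt{2\pi}\,(-1)^{k+1}/k$; the example and the names `c`, `bessel_sum`, and `gaps` are illustrative assumptions rather than part of the text.

```python
import math

# Integral of [f(x)]^2 = x^2 over (0, pi)
norm_sq = math.pi ** 3 / 3

def c(k):
    """Fourier coefficient of f(x) = x relative to sqrt(2/pi) * sin(kx).

    c_k = sqrt(2/pi) * integral_0^pi x sin(kx) dx = sqrt(2*pi) * (-1)**(k+1) / k
    """
    return math.sqrt(2 * math.pi) * (-1) ** (k + 1) / k

def bessel_sum(n):
    """Partial sum c_1^2 + ... + c_n^2 appearing in (23-8) and (23-9)."""
    return sum(c(k) ** 2 for k in range(1, n + 1))

# Minimum mean-square error (23-8) for increasing n
gaps = [norm_sq - bessel_sum(n) for n in (1, 10, 100, 1000)]
print(gaps)
```

Each gap is positive, as Bessel's inequality requires, and the gaps shrink toward zero, which is the behavior expressed by (23-10) for this particular f.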
A set $\phi_n(x)$ is said to be complete if there is no nontrivial function²
f(x) which is orthogonal to all of the $\phi_n$'s. That is, the set is complete if
$$c_k = \int_a^b f(x)\phi_k(x)\,dx = 0 \quad\text{for } k = 1, 2, 3, \ldots$$ (23-11)
implies that
$$\int_a^b [f(x)]^2\,dx = 0.$$ (23-12)
Now, whenever (23-10) holds, (23-11) yields (23-12) at once. Hence we
have:
Corollary 5. Every closed set $\phi_n(x)$ is complete.
The converse is also true: Every complete set is closed. This converse, however, requires
a more general integral than that of Riemann. The generalized integral is known as
the Lebesgue integral; it was first constructed to deal with this very problem. A brief
description of the Lebesgue integral is given in Appendix C.
The notions of closure and completeness have simple analogues in the elementary theory
of vectors. Thus, a set of vectors $\mathbf{V}_1, \mathbf{V}_2, \mathbf{V}_3$ is said to be closed if every vector $\mathbf{V}$ can
be written in the form
$$\mathbf{V} = c_1\mathbf{V}_1 + c_2\mathbf{V}_2 + c_3\mathbf{V}_3$$
for some choice of the constants $c_k$. The set of vectors $\mathbf{V}_1, \mathbf{V}_2, \mathbf{V}_3$ is said to be complete
if there is no nontrivial vector orthogonal to all of them. That is, the set is complete if
the condition
$$\mathbf{V}\cdot\mathbf{V}_k = 0 \quad\text{for } k = 1, 2, 3$$
implies $\mathbf{V}\cdot\mathbf{V} = 0$.
In this setting, it is obvious that closure and completeness are equivalent, for both
conditions simply state that the three vectors $\mathbf{V}_1, \mathbf{V}_2, \mathbf{V}_3$ are not coplanar. These matters
are taken up more fully in Chap. 4.
¹ See E. C. Titchmarsh, “The Theory of Functions,” p. 414, Oxford University Press,
London, 1950.
² In the theory of mean convergence f(x) is regarded as trivial if f(x) = 0 for so many
values of x that
$$\int_a^b [f(x)]^2\,dx = 0.$$
PROBLEMS
1. (a) Show that Parseval's equality takes the form
$$\frac{1}{\pi}\int_0^{2\pi} [f(x)]^2\,dx = \frac{1}{2}a_0^2 + \sum_{n=1}^\infty (a_n^2 + b_n^2)$$
when $\phi_n(x)$ are the trigonometric functions on $(0, 2\pi)$. (b) Specialize to sine and cosine
series on $(0, \pi)$.
2. It is desired to approximate 1 by
$$p(x) = a_1\sin x + a_2\sin 2x + a_3\sin 3x$$
in such a way that $\int_0^\pi [1 - p(x)]^2\,dx$ is minimum. How should the coefficients $a_k$ be
determined?
3. Give a direct proof that as $n \to \infty$,
$$\int_0^{2\pi} f(x)\sin nx\,dx \to 0, \qquad \int_0^{2\pi} f(x)\cos nx\,dx \to 0,$$
if f(x) is periodic of period $2\pi$ and has a continuous derivative f'(x). Hint: Integrate
by parts.
4. Obtain the formula $a_k = c_k$ from (23-4) by using the fact that $\partial E/\partial a_k = 0$ at the
minimum value of E.
24. The Pointwise Convergence of Fourier Series. We shall now obtain
an explicit formula for the difference between a function and the nth partial
sum of its (trigonometric) Fourier series. The formula will enable us to
establish the convergence for a class of functions which includes all the
examples given in this book.
If f(x) is a bounded integrable function of period $2\pi$, the nth partial
sum of its Fourier series is
$$s_n(x) = \frac{1}{2}a_0 + \sum_{k=1}^n (a_k\cos kx + b_k\sin kx),$$ (24-1)
where the coefficients are given by
$$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos kt\,dt, \qquad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin kt\,dt.$$ (24-2)
Substituting (24-2) into the series (24-1) we get
$$s_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\left[\frac{1}{2} + \sum_{k=1}^n \cos k(t - x)\right]dt.$$
If we define the so-called Dirichlet kernel by
$$D_n(u) = \frac{1}{2} + \sum_{k=1}^n \cos ku,$$ (24-3)
the foregoing result takes the simpler form
$$s_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)D_n(t - x)\,dt.$$ (24-4)
Setting t − x = u in (24-4) yields
$$s_n(x) = \frac{1}{\pi}\int_{-\pi - x}^{\pi - x} f(x + u)D_n(u)\,du.$$ (24-5)
Now, $D_n(u)$ has period $2\pi$ by inspection of (24-3), and f(x) also has period
$2\pi$. Hence, the integral of $f(u + x)D_n(u)$ over any interval of length $2\pi$ is
the same as the integral over any other interval of length $2\pi$, and (24-5)
may be replaced by
$$s_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x + u)D_n(u)\,du.$$ (24-6)
Since $D_n(-u) = D_n(u)$ by (24-3), we may replace u by −u in (24-6) to
obtain the alternative form
$$s_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x - u)D_n(u)\,du.$$ (24-7)
The sum of (24-6) and (24-7) yields
$$2s_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} [f(x + u) + f(x - u)]D_n(u)\,du.$$
Since the integrand is an even function of u, the integral from 0 to $\pi$ is half
the integral from $-\pi$ to $\pi$, and we have thus established that
$$s_n(x) = \frac{1}{\pi}\int_0^{\pi} [f(x + u) + f(x - u)]D_n(u)\,du.$$ (24-8)
To introduce f(x) into our considerations, we observe that
$$\frac{1}{2} = \frac{1}{\pi}\int_0^{\pi} D_n(u)\,du,$$ (24-9)
since the terms involving cos ku in (24-3) integrate to zero. If (24-9) is
multiplied by 2f(x) (which is constant with respect to the integration variable
u), we get
$$f(x) = \frac{1}{\pi}\int_0^{\pi} 2f(x)D_n(u)\,du.$$ (24-10)
Subtracting (24-10) from (24-8) gives the fundamental formula
$$s_n(x) - f(x) = \frac{1}{\pi}\int_0^{\pi} [f(x + u) - 2f(x) + f(x - u)]D_n(u)\,du,$$ (24-11)
which will now be used to study the convergence of $s_n(x)$ to f(x).
We shall say that f(x) is piecewise smooth if the graph of f(x) consists of
a finite number of curves on each of which f'(x) exists. We suppose also
that the derivative exists at the end points of these curves, in the sense
$$\lim_{u \to 0+}\frac{f(x + u) - f(x+)}{u} \quad\text{or}\quad \lim_{u \to 0+}\frac{f(x - u) - f(x-)}{-u},$$ (24-12)
where “$u \to 0+$” means $u \to 0$ through positive values. Such a function
may have finitely many discontinuities. However, since the Fourier coefficients
of f(x) are not altered if f(x) is redefined at a finite number of
points, we can assume that
$$f(x) = \frac{f(x+) + f(x-)}{2}$$ (24-13)
at every point x, whether f(x) is continuous at x or not.
These preliminaries lead to the following theorem:
Theorem. If f(x) is periodic of period $2\pi$, is piecewise smooth, and is
defined at points of discontinuity by (24-13), then the Fourier series for f(x)
converges to f(x) at every value of x.
To establish this theorem we recall that the series (24-3) was summed in
Sec. 17, Example 1. The result (17-5) yields
$$D_n(u) = \frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}.$$ (24-14)
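A quick numerical comparison of the sum (24-3) with the closed form $\sin(n + \tfrac{1}{2})u / (2\sin\tfrac{1}{2}u)$ can serve as a sanity check. The sample values of n and u below, and the function names, are arbitrary illustrative choices.

```python
import math

def dirichlet_sum(n, u):
    """D_n(u) as defined by the finite sum (24-3)."""
    return 0.5 + sum(math.cos(k * u) for k in range(1, n + 1))

def dirichlet_closed(n, u):
    """Closed form of D_n(u); valid when sin(u/2) is not zero."""
    return math.sin((n + 0.5) * u) / (2.0 * math.sin(0.5 * u))

# Largest discrepancy over a few sample points (chosen arbitrarily)
max_diff = max(
    abs(dirichlet_sum(n, u) - dirichlet_closed(n, u))
    for n in (1, 5, 20)
    for u in (0.1, 0.7, 1.5, 3.0, -2.2)
)
print(max_diff)
```

The two expressions should differ only by rounding error at every u for which $\sin\tfrac{1}{2}u \ne 0$.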
If we substitute this into (24-11) and replace 2f(x) by f(x+) + f(x−) in
accordance with (24-13), we get
$$s_n(x) - f(x) = \frac{1}{\pi}\int_0^{\pi}\frac{f(x + u) - f(x+) + f(x - u) - f(x-)}{2\sin\tfrac{1}{2}u}\,\sin(n + \tfrac{1}{2})u\,du.$$
Now, the expression
$$\frac{f(x + u) - f(x+)}{2\sin\tfrac{1}{2}u} = \frac{f(x + u) - f(x+)}{u}\cdot\frac{u/2}{\sin(u/2)}$$ (24-15)
has a limit as $u \to 0+$, since
$$\lim_{u \to 0}\frac{u/2}{\sin(u/2)} = 1$$
and since the limits (24-12) exist by hypothesis. If we define the value of
(24-15) at u = 0 to be this limit, then the expression is continuous for
$u \ge 0$ as long as the points
$$[x, f(x+)] \quad\text{and}\quad [x + u, f(x + u)]$$
are on the same smooth curve belonging to the graph of f(x).
On the other hand, for $u \ne 0$ the function (24-15) is just as well behaved
as the numerator f(x + u) − f(x+), since $\sin\tfrac{1}{2}u$ does not vanish. This
shows that the graph of (24-15) consists of a finite number of continuous
curves, which have finite limits as one approaches their end points and
hence are bounded.
It follows from Corollary 3 of the preceding section¹ that
$$\lim_{n \to \infty}\int_0^{\pi}\frac{f(x + u) - f(x+)}{2\sin\tfrac{1}{2}u}\,\sin(n + \tfrac{1}{2})u\,du = 0.$$
In just the same way it is found that
$$\lim_{n \to \infty}\int_0^{\pi}\frac{f(x - u) - f(x-)}{2\sin\tfrac{1}{2}u}\,\sin(n + \tfrac{1}{2})u\,du = 0,$$
and hence the integral representing $s_n(x) - f(x)$ tends to zero as $n \to \infty$.
This shows that
$$\lim_{n \to \infty} s_n(x) = f(x)$$
and completes the proof of the theorem.
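The theorem can be illustrated with a function that has a jump. The sketch below assumes the standard square wave equal to 1 on $(0, \pi)$ and to −1 on $(-\pi, 0)$, whose Fourier series is $(4/\pi)(\sin x + \tfrac{1}{3}\sin 3x + \tfrac{1}{5}\sin 5x + \cdots)$; this example and the helper name `partial_sum` are not taken from the section and are chosen only for illustration.

```python
import math

def partial_sum(n_terms, x):
    """Sum of the first n_terms nonzero terms of the square-wave series."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * j + 1) * x) / (2 * j + 1) for j in range(n_terms)
    )

# At the jump x = 0 every partial sum equals the mean value [f(0+) + f(0-)]/2 = 0,
# in agreement with (24-13).
at_jump = partial_sum(50, 0.0)

# At the interior point x = 1 the partial sums approach f(1) = 1.
errors = [abs(partial_sum(n, 1.0) - 1.0) for n in (10, 100, 1000)]
print(at_jump, errors)
```

The printed errors give a rough idea of how slowly the series converges near, but not at, a discontinuity.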
25. The Integration and Differentiation of Fourier Series. If f(x) is
piecewise continuous² on $[-\pi, \pi]$, then the function
$$F(x) = \int_{-\pi}^{x} f(t)\,dt$$ (25-1)
is continuous and piecewise smooth (Sec. 24). Moreover, F(x) remains
continuous when defined to have period $2\pi$, provided $F(-\pi) = F(\pi)$.
Since $F(-\pi) = 0$, the latter condition reduces to
$$F(\pi) = \int_{-\pi}^{\pi} f(t)\,dt = \pi a_0 = 0,$$ (25-2)
where $\tfrac{1}{2}a_0$ is the first Fourier coefficient of f(x). Applying the theorem of
¹ The presence of the $\tfrac{1}{2}$ in $\sin(n + \tfrac{1}{2})u$ causes no trouble, since
$$\sin(n + \tfrac{1}{2})u = \sin nu\cos\tfrac{1}{2}u + \cos nu\sin\tfrac{1}{2}u$$
and Corollary 3 applies to each term.
² This means that the interval $[-\pi, \pi]$ can be divided by points $x_1, x_2, \ldots, x_n$ into a
finite number of intervals on each of which f(x) is continuous. Also f(x) must have a
limit as $x \to x_k+$ and as $x \to x_k-$.
the preceding section, we can now deduce that the Fourier series for the
periodic function F(x) converges to F(x) at every value of x.
This result can be obtained when f(x) and |f(x)| are only assumed integrable, without
being piecewise continuous. Indeed, we can always write f(x) = P(x) − N(x) where
P(x) is positive and −N(x) is negative. The equation
$$F(x) = \int_{-\pi}^{x} P(t)\,dt - \int_{-\pi}^{x} N(t)\,dt$$
expresses F(x) as the difference of two increasing continuous functions. Since such
functions satisfy the Dirichlet conditions, the desired result can be deduced from Dirichlet's
theorem as quoted in Sec. 18.
We shall show next that the Fourier series for F(x) is obtained by integrating
the series for f(x). If $n \ge 1$, the Fourier cosine coefficient $A_n$ of
F(x) satisfies
$$\pi A_n = \int_{-\pi}^{\pi} F(x)\cos nx\,dx = \left[F(x)\,\frac{\sin nx}{n}\right]_{-\pi}^{\pi} - \int_{-\pi}^{\pi} F'(x)\,\frac{\sin nx}{n}\,dx$$
when we integrate by parts. Since $F(-\pi) = F(\pi) = 0$, the integrated
part drops out, and since F'(x) = f(x), the expression becomes
$$\pi A_n = -\frac{1}{n}\int_{-\pi}^{\pi} \sin nx\, f(x)\,dx = -\pi\,\frac{b_n}{n}.$$
In the same way $B_n = a_n/n$, and also
$$A_0 = -\frac{1}{\pi}\int_{-\pi}^{\pi} x f(x)\,dx.$$ (25-3)
These considerations establish the following remarkable theorem:
Theorem I. Let f(x) be a function of period $2\pi$ which has a Fourier series
$$\sum (a_n\cos nx + b_n\sin nx).$$ (25-4)
Then, with $A_0$ given by (25-3),
$$\int_{-\pi}^{x} f(t)\,dt = \frac{1}{2}A_0 + \sum\left(\frac{a_n}{n}\sin nx - \frac{b_n}{n}\cos nx\right),$$ (25-5)
and this equation holds for all x, even if the Fourier series (25-4) does not
converge. Moreover, the series (25-5) is actually the Fourier series of the
function on the left.
In case $a_0 \ne 0$, so that the Fourier series for f(x) is
$$\tfrac{1}{2}a_0 + \sum (a_n\cos nx + b_n\sin nx),$$
we apply Theorem I to $f(x) - \tfrac{1}{2}a_0$. Inasmuch as
$$\int_{\alpha}^{\beta} f(x)\,dx = \int_{-\pi}^{\beta} f(x)\,dx - \int_{-\pi}^{\alpha} f(x)\,dx = F(\beta) - F(\alpha)$$
for all $\alpha$ and $\beta$, the reader may deduce, by Theorem I, that
$$\int_{\alpha}^{\beta} f(x)\,dx = \int_{\alpha}^{\beta} (\tfrac{1}{2}a_0)\,dx + \sum\int_{\alpha}^{\beta} (a_n\cos nx + b_n\sin nx)\,dx.$$ (25-6)
This result may be summarized as follows:
Theorem II. Any Fourier series (whether convergent or not) can be integrated
term by term between any limits. The integrated series converges to
the integral of the periodic function corresponding to the original series.
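For example, according to (18-5) the Fourier series for $\tfrac{1}{2}x$ is
$$\tfrac{1}{2}x = \sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \cdots.$$ (25-7)
If we integrate from a to x by Theorem II, we get
$$\tfrac{1}{4}(x^2 - a^2) = \sum_{n=1}^\infty (-1)^n\,\frac{\cos nx - \cos na}{n^2}.$$
Treating a as constant, we see that
$$\tfrac{1}{4}x^2 = C + \sum_{n=1}^\infty (-1)^n\,\frac{\cos nx}{n^2},$$ (25-8)
where C is constant. Since C is the first Fourier coefficient of $\tfrac{1}{4}x^2$, we
obtain
$$C = \frac{1}{2\pi}\int_{-\pi}^{\pi} \tfrac{1}{4}x^2\,dx = \frac{\pi^2}{12}.$$ (25-9)
Alternatively, because $a_0 = 0$ in (25-7), we can use (25-3) to obtain
$$A_0 = -\frac{1}{\pi}\int_{-\pi}^{\pi} \tfrac{1}{2}x^2\,dx = -\frac{\pi^2}{3},$$
and hence, by (25-5),
$$F(x) = \tfrac{1}{4}x^2 - \tfrac{1}{4}\pi^2 = -\frac{\pi^2}{6} + \sum_{n=1}^\infty (-1)^n\,\frac{\cos nx}{n^2}.$$
The consistency of this result with (25-8) and (25-9) is easily verified.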
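The integrated series can be checked numerically. The sketch below compares $\tfrac{1}{4}x^2$ with $\pi^2/12 + \sum (-1)^n \cos nx / n^2$ at a few points of $[-\pi, \pi]$; the truncation at 2000 terms, the sample points, and the function name are arbitrary illustrative choices.

```python
import math

def integrated_series(x, terms=2000):
    """Partial sum of pi^2/12 + sum over n of (-1)^n cos(nx) / n^2."""
    total = math.pi ** 2 / 12
    for n in range(1, terms + 1):
        total += (-1) ** n * math.cos(n * x) / n ** 2
    return total

# Sample points in [-pi, pi], including both end points
points = (-math.pi, -1.0, 0.0, 0.5, 2.0, math.pi)
max_err = max(abs(integrated_series(x) - x * x / 4) for x in points)
print(max_err)
```

Because the terms are bounded by $1/n^2$, the truncation error is at most about 1/2000 here, so the agreement is uniform in x, in contrast with the slowly convergent series (25-7).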
Although Fourier series can always be integrated, as we have just seen,
the differentiation of Fourier series requires caution. For example, the
series (25-7) converges for all x, and yet the series
$$\cos x - \cos 2x + \cos 3x - \cos 4x + \cdots$$
obtained by differentiating (25-7) diverges for all x. The trouble is that
the function $\tfrac{1}{2}x$ (when made periodic) has no derivative at the points $\pm\pi$,
$\pm 3\pi$, $\pm 5\pi$, ....
This example is quite typical of the general situation, which can be described
as follows: There is not much hope of being able to differentiate a
Fourier series, unless the periodic function generating the series has a derivative
at every value of x. On the other hand, when this condition is fulfilled,
we usually can differentiate, as is shown by the following theorem:
Theorem III. Let f(x) have period $2\pi$, and suppose f'(x) exists for every
value of x, without exception. If f'(x) is continuous,¹ the Fourier series for
f'(x) can be obtained by differentiating the Fourier series for f(x). If f'(x)
is continuous and has only a finite number of maxima and minima on $[-\pi, \pi]$,
the differentiated series actually converges to f'(x) for every x.
Repeated application of the theorem gives the corresponding result for
higher derivatives. For instance, the series for f''(x) can be found by differentiating
the series for f(x) twice, provided f''(x) satisfies the conditions
of the theorem.
We shall establish Theorem III by applying Theorem I to the function
f'(x). Being continuous, f'(x) has a Fourier series, and the constant term
$a_0$ can be found from
$$\pi a_0 = \int_{-\pi}^{\pi} f'(x)\,dx = f(\pi) - f(-\pi) = 0.$$
Thus, the series for f'(x) has the form (25-4), namely,
$$\sum (a_n\cos nx + b_n\sin nx).$$ (25-10)
It follows from Theorem I that the Fourier series for the function
$$\int_{-\pi}^{x} f'(t)\,dt = f(x) - f(-\pi)$$
has the form (25-5), and hence the series for f(x) is
$$f(-\pi) + \frac{1}{2}A_0 + \sum\left(\frac{a_n}{n}\sin nx - \frac{b_n}{n}\cos nx\right).$$ (25-11)
By inspection, we see that differentiating (25-11) gives (25-10). In other
words, the Fourier series for f'(x) can be found by differentiating the series
for f(x), and this is the main assertion in Theorem III.
Since the differentiated series is a Fourier series, its convergence can be
tested by the usual methods. In particular, if f'(x) satisfies the Dirichlet
conditions and is continuous, then the Fourier series for f'(x) converges to
f'(x). Thus, Theorem III is established.
The foregoing methods lead to some important inequalities for the
Fourier coefficients. When a function f(x) satisfies the Dirichlet conditions,
it can be shown² that the Fourier coefficients have the order of magnitude
1/n. That is, there is a constant M depending on f(x) but not on n
¹ It can be shown that if f'(x) satisfies the conditions of Dirichlet, then f'(x) is necessarily
continuous. This follows from Darboux's theorem. See, for example, L. Brand,
“Advanced Calculus,” p. 112, John Wiley & Sons, Inc., New York, 1955.
² See I. S. Sokolnikoff, “Advanced Calculus,” p. 406, McGraw-Hill Book Company,
Inc., New York, 1939. Cf. also Prob. 4.
such that
$$|a_n| \le \frac{M}{n}, \qquad |b_n| \le \frac{M}{n}.$$ (25-12)
Now, if the Fourier coefficients of f'(x) in (25-10) satisfy these conditions,
then (25-11) shows that the coefficients of f(x) are bounded by $M/n^2$.
More generally, we can start with $f^{(k)}(x)$ and integrate k times. The constants
of integration drop out as in the derivation of (25-10), and we
obtain:
Theorem IV. Let f(x) have period $2\pi$ and suppose the kth derivative of
f(x) satisfies the conditions of Dirichlet on $[-\pi, \pi]$. Then the Fourier coefficients
of f(x) satisfy the inequalities
$$|a_n| \le \frac{M}{n^{k+1}}, \qquad |b_n| \le \frac{M}{n^{k+1}},$$
where the constant M depends on f(x) but not on n.
PROBLEMS
1. By integrating the series (25-8) from 0 to x deduce that
$$x(x^2 - \pi^2) = 12\sum_{n=1}^\infty (-1)^n\,\frac{\sin nx}{n^3}.$$
2. By integrating the series in Prob. 1 from $\pi$ to x deduce that
$$\frac{1}{48}(\pi^2 - x^2)^2 = \frac{\pi^4}{90} - \sum_{n=1}^\infty (-1)^n\,\frac{\cos nx}{n^4}.$$
3. Show that the following is not a Fourier series:
$$\sum_{n=1}^\infty \frac{\sin nx}{\log(1 + n)}.$$
Hint: If it is a Fourier series, the integrated series must converge for all x.
4. Deduce (25-12) when f(x) is piecewise smooth on $[-\pi, \pi]$. Hint: Let the points
$x_k$ divide $[-\pi, \pi]$ into a finite number of intervals on each of which f'(x) is continuous.
The Fourier coefficients are obtained by adding integrals of the type
$$\int_{x_k}^{x_{k+1}} f(x)\cos nx\,dx \quad\text{or}\quad \int_{x_k}^{x_{k+1}} f(x)\sin nx\,dx,$$
and these can be integrated by parts.
CHAPTER 3
FUNCTIONS OF SEVERAL VARIABLES
The Technique of Differentiation
1. Basic Notions
2. Partial Derivatives
3. Total Differentials
4. Chain Rule
5. Differentiation of Composite and Implicit Functions
6. Higher Derivatives of Implicit Functions
7. Change of Variables
Applications of Differentiation
8. Directional Derivatives
9. Maxima and Minima of Functions of Several Variables
10. Constrained Maxima and Minima
11. Lagrange Multipliers
12. Taylor's Formula for Functions of Several Variables
Integrals with Several Variables
13. Differentiation under the Integral Sign
14. The Calculus of Variations
15. Variational Problems with Constraints
16. Change of Variables in Multiple Integrals
17. Surface Integrals
The considerations of the preceding chapters were confined primarily
to functions y = f{x) of a single independent variable x. One does not
have to go far to encounter functional relationships depending on two or
more independent variables. In courses in analytic geometry and calculus
the reader has learned that a functional relationship of the form z = f(x,y)
may be represented as a surface, and he has made use of partial derivatives
to study some properties of surfaces. In this chapter the familiar concepts
underlying the study of real functions of two variables are sharpened and
extended to functions of many variables. The bearing of such extensions
on the calculation of rates of change and maximum and minimum values
of functions of several variables is indicated in numerous problems of
practical interest.
The concluding sections of the chapter deal with integrals of functions
of several variables. They contain an introduction to the calculus of
variations, a subject of great importance in physics and technology.
Many situations can be characterized by statements to the effect that
certain integrals attain extreme values. The determination of such ex-
tremes is in the province of calculus of variations.
THE TECHNIQUE OF DIFFERENTIATION
1. Basic Notions. Let $z = f(x,y)$ be a real-valued function of two independent
variables (x,y). We can think of (x,y) as the coordinates of a
point in the xy plane and interpret z as the height of the surface defined
by $z = f(x,y)$. The function f(x,y) may be determined for every point
(x,y) in the xy plane, or the points for which it is determined may occupy
a certain region R in that plane.
For example,
$$z = x^2 + y^2$$ (1-1)
represents the paraboloid of revolution for every pair of values (x,y), while
$$z = \sqrt{1 - x^2 - y^2}$$ (1-2)
represents the surface of the hemisphere only for those values of (x,y)
for which $x^2 + y^2 \le 1$. In the example (1-1) the region R is the entire
xy plane, while in (1-2) it is the interior and the boundary of the unit
circle $x^2 + y^2 = 1$. The function
$$z = \frac{1}{\sqrt{1 - x^2 - y^2}}$$ (1-3)
is defined in the circular region $x^2 + y^2 < 1$, but not on the boundary
$x^2 + y^2 = 1$. If the region of definition of the function includes its
boundary C, we shall say that the function is defined in the closed region R.
When the boundary of the region R is not included, the region is said to
be open.
To define the continuity of $z = f(x,y)$ at a given point we need the
notion of the neighborhood of that point. The neighborhood of $P(x_0,y_0)$
is the set of all points P(x,y) interior to a circle with center at $(x_0,y_0)$.
If the radius of this circle is $\delta > 0$, then the neighborhood of $(x_0,y_0)$ is
a circular region $(x - x_0)^2 + (y - y_0)^2 < \delta^2$. The positive number $\delta$
can be chosen arbitrarily small. The extension of this definition to spaces
of more than two dimensions is immediate. The neighborhood of
$P(x_0,y_0,z_0)$ is the open spherical region
$$(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 < \delta^2.$$
The neighborhood of the point $P_0(x_0,y_0,z_0,t_0)$ in the space of four variables
x, y, z, t is the set of “points” (x,y,z,t) such that
$$(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 + (t - t_0)^2 < \delta^2,$$
and so on for spaces of higher dimensionality.
Intuitively the notion of continuity of $z = f(x,y)$ at a given point
$P_0(x_0,y_0)$ means that the value of f(x,y) throughout a neighborhood of
$(x_0,y_0)$ will differ from $f(x_0,y_0)$ by as little as desired if the neighborhood is
chosen sufficiently small. In symbols this means that if one specifies a
positive number $\epsilon$, no matter how small, then for all points in a certain
circular region $(x - x_0)^2 + (y - y_0)^2 < \delta^2$ we have
$$|f(x,y) - f(x_0,y_0)| < \epsilon.$$ (1-4)
An alternative notation for (1-4) is
$$\lim_{(x,y) \to (x_0,y_0)} f(x,y) = f(x_0,y_0),$$ (1-5)
which states that as the point (x,y) is made to approach $(x_0,y_0)$, the value
of the limit is equal to the value of the function at $(x_0,y_0)$.
We extend this definition (1-5) to functions $f(x_1,x_2,\ldots,x_n)$ of n variables
in the obvious way: A function $f(x_1,x_2,\ldots,x_n)$ is continuous at the point
$P_0(x_1^0,x_2^0,\ldots,x_n^0)$ whenever
$$\lim_{(x_1,\ldots,x_n) \to (x_1^0,\ldots,x_n^0)} f(x_1,x_2,\ldots,x_n) = f(x_1^0,x_2^0,\ldots,x_n^0).$$
The “point P” here means the set of n real numbers $(x_1,x_2,\ldots,x_n)$. Clearly,
$f(x_1,x_2,\ldots,x_n)$ cannot be continuous at $(x_1,x_2,\ldots,x_n)$ if it is not defined at
that point.
Whenever $f(x_1,x_2,\ldots,x_n)$ is continuous at every point P of the given
region R, it is said to be continuous in the region R. Functions with which
we shall deal for the most part will be continuous in some region, open or
closed.
PROBLEM
Describe the regions of definition and the surfaces defined by the following functions z:
(a) $x - y + z = 1$; (b) $z = y$;
(c) $y^2 + z^2 = 25$; (d) $z = 1/(x^2 + y^2)$;
(e) $z = 1/x$; (f) $z = \sqrt{1 - (x - 1)^2 - y^2}$.
2. Partial Derivatives. Let $u = f(x,y)$ be a function of two independent
variables x, y, and let it be defined at a point $(x_0,y_0)$ and for all values of
(x,y) in some neighborhood of $(x_0,y_0)$. If y is set equal to $y_0$, then u becomes
a function of one variable x, namely,
$$u = f(x,y_0).$$
If this function has a derivative with respect to x, the derivative is
called the partial derivative of f(x,y) with respect to x for $y = y_0$. In like
manner, if x is assigned a constant value $x_0$, the derivative with respect
to y of the resulting function $f(x_0,y)$ is called the partial derivative of f(x,y)
with respect to y for $x = x_0$. The customary notations for the partial
derivative of $u = f(x,y)$ with respect to x are
$$\frac{\partial u}{\partial x}, \quad u_x, \quad f_x, \quad\text{and}\quad \frac{\partial f}{\partial x}.$$
The partial derivatives of a function $f(x_1,x_2,\ldots,x_n)$ of n independent
variables are obtained by fixing in it the values of n − 1 variables and
computing the derivative of the resulting function of a single variable.
Thus,
$$f(x,y) = yx^2 - 2yx$$ (2-1)
has the partial derivatives
$$f_x = 2xy - 2y, \qquad f_y = x^2 - 2x.$$ (2-2)
If $u = f(x,y)$ is a function of two independent variables, it is easy to
provide a simple geometric interpretation of partial derivatives $u_x$ and $u_y$.
The equation $u = f(x,y)$ is the equation of a surface (see Fig. 1). If x
is given a fixed value $x_0$, $u = f(x_0,y)$ is the equation of the curve AB on
the surface formed by the intersection of the surface and the plane $x = x_0$.
Then
$$u_y = \frac{\partial u}{\partial y} = \lim_{\Delta y \to 0}\frac{f(x_0, y_0 + \Delta y) - f(x_0,y_0)}{\Delta y}$$
is the slope at any point of AB. Similarly, if y is assigned the constant
value $y_0$, then $u = f(x,y_0)$ is the equation of the curve CD on the surface
and
$$u_x = \frac{\partial u}{\partial x} = \lim_{\Delta x \to 0}\frac{f(x_0 + \Delta x, y_0) - f(x_0,y_0)}{\Delta x}$$
is the slope at any point of CD.
Fig. 1
In Chap. 5 we shall see that the partial derivatives $u_x$, $u_y$, $u_z$ of $u = f(x,y,z)$
can be interpreted as rectangular components of a certain vector,
called the gradient of u. This vector provides a measure of the space
rate of change of u.
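The partial derivatives $f_{x_1}, f_{x_2}, \ldots, f_{x_n}$ of $f(x_1,x_2,\ldots,x_n)$ are functions
of $x_1, x_2, \ldots, x_n$, and they may have partial derivatives with respect to
some or all of these variables. These derivatives are called second partial
derivatives of $f(x_1,x_2,\ldots,x_n)$. If there are only two independent variables,
f(x,y) may have the second partial derivatives
$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = f_{xx}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = f_{xy},$$
$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = f_{yx}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = f_{yy}.$$
It should be noted that $f_{xy}$ means that $\partial f/\partial x$ is first found and then
$\partial/\partial y\,(\partial f/\partial x)$ is determined, so that the subscripts indicate the order in
which the derivatives are computed. In
$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)$$
the order is in keeping with the meaning of the symbol, so that the order
appears as the reverse of the order in which the derivatives are taken.
For the function f(x,y) in (2-1) we get, on noting (2-2),
$$f_{xy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial y}(2xy - 2y) = 2x - 2,$$
$$f_{yx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}(x^2 - 2x) = 2x - 2,$$
$$f_{yy} = \frac{\partial}{\partial y}(x^2 - 2x) = 0,$$
$$f_{xx} = \frac{\partial}{\partial x}(2xy - 2y) = 2y.$$
In this example $f_{xy} = f_{yx}$, and indeed, one rarely meets functions for which
the so-called mixed derivatives are unequal. In fact one can prove¹ that
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$$
whenever these derivatives are continuous at the point in question.
The process of defining partial derivatives of higher orders is obvious
from the foregoing, and it is possible to establish equalities such as
$f_{xyx} = f_{xxy} = f_{yxx}$ and $f_{yxy} = f_{xyy} = f_{yyx}$, whenever these derivatives are
continuous at the point in question.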
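The equality of the mixed derivatives for the polynomial $f(x,y) = yx^2 - 2yx$ can also be checked by finite differences. In the sketch below the central-difference helpers `d_dx` and `d_dy`, the step size, and the sample point are illustrative assumptions; the order of the two numerical differentiations mirrors the meaning of the subscripts in $f_{xy}$ and $f_{yx}$.

```python
def f(x, y):
    """The polynomial y*x^2 - 2*y*x used in the text's example."""
    return y * x ** 2 - 2 * y * x

def d_dx(g, h=1e-3):
    """Central-difference approximation to the partial derivative in x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-3):
    """Central-difference approximation to the partial derivative in y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

f_xy = d_dy(d_dx(f))  # differentiate with respect to x first, then y
f_yx = d_dx(d_dy(f))  # differentiate with respect to y first, then x

x0, y0 = 1.5, -0.7    # arbitrary sample point
exact = 2 * x0 - 2    # value of 2x - 2 at the sample point
print(f_xy(x0, y0), f_yx(x0, y0), exact)
```

For a polynomial of this low degree the central differences are exact apart from rounding, so both numerical values should agree closely with 2x − 2.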
We note in conclusion that although the notation $\partial u/\partial x$ for the partial
derivative $u_x$ suggests a quotient of some quantities analogous to the differentials
dy and dx in the notation dy/dx for the derivative of y = f(x),
no such interpretation is available for partial derivatives. To stress the
point that $\partial u/\partial x$ should never be thought of as a fraction, we give an
example.
Example. Consider the equation for an ideal gas pv = RT, where p is the pressure,
v is the volume, T is the absolute temperature, and R is a physical constant. It should
be noted first that the concept of partial derivatives hinges on the agreement as to which
variables in a given functional relationship are assumed to be independent. Thus, if
¹ See I. S. Sokolnikoff, “Advanced Calculus,” sec. 31, McGraw-Hill Book Company,
Inc., New York, 1939.
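we solve our gas equation for p, we obtain
$$p = \frac{RT}{v}.$$
We can then compute
$$\frac{\partial p}{\partial v} = -\frac{RT}{v^2}, \qquad \frac{\partial p}{\partial T} = \frac{R}{v}.$$ (2-3)
On the other hand, if we solve for v, we get
$$v = \frac{RT}{p},$$
in which p and T are now regarded as the independent variables, and we can, therefore,
compute
$$\frac{\partial v}{\partial T} = \frac{R}{p}, \qquad \frac{\partial v}{\partial p} = -\frac{RT}{p^2}.$$ (2-4)
We can also solve for T and get
$$T = \frac{pv}{R},$$
in which p and v are to be considered as the independent variables, so that
$$\frac{\partial T}{\partial p} = \frac{v}{R}, \qquad \frac{\partial T}{\partial v} = \frac{p}{R}.$$ (2-5)
From Eqs. (2-3) to (2-5) we obtain
$$\frac{\partial p}{\partial v}\,\frac{\partial v}{\partial T}\,\frac{\partial T}{\partial p} = \left(-\frac{RT}{v^2}\right)\frac{R}{p}\,\frac{v}{R} = -\frac{RT}{pv} = -1,$$ (2-6)
since pv = RT. But if it were possible to treat the terms in the left-hand member of
(2-6) as fractions, we should have obtained +1.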
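A short numerical sketch makes the same point for the ideal gas law pv = RT. Each partial derivative below is approximated by a central difference with the appropriate variable held fixed; the numerical value of R, the chosen state, and the helper names are illustrative assumptions.

```python
R = 8.314  # illustrative value of the gas constant; any positive R gives the same product

def pressure(v, T):
    return R * T / v      # p as a function of the independent variables v, T

def volume(p, T):
    return R * T / p      # v as a function of the independent variables p, T

def temperature(p, v):
    return p * v / R      # T as a function of the independent variables p, v

def central(g, x, rel=1e-6):
    """Central-difference derivative of a one-variable function g at x."""
    h = rel * abs(x)
    return (g(x + h) - g(x - h)) / (2 * h)

T0, v0 = 300.0, 0.025     # an arbitrary state
p0 = pressure(v0, T0)

dp_dv = central(lambda v: pressure(v, T0), v0)      # T held fixed
dv_dT = central(lambda T: volume(p0, T), T0)        # p held fixed
dT_dp = central(lambda p: temperature(p, v0), p0)   # v held fixed

product = dp_dv * dv_dT * dT_dp
print(product)
```

The product comes out close to −1 rather than +1, because a different variable is held constant in each of the three derivatives.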
PROBLEMS
1. Find $\partial z/\partial x$ and $\partial z/\partial y$ for each of the following functions:
(a) $z = y/x$; (b) $z = x^3y + \tan^{-1}(y/x)$; (c) $z = \sin xy + x$; (d) $z = e^x\log y$;
(e) $z = x^2y + \sin^{-1} x$.
2. Find $\partial u/\partial x$, $\partial u/\partial y$, and $\partial u/\partial z$ for each of the following functions:
(a) $u = x^2y + yz - xz^2$; (b) $u = xyz + \log xy$;
(c) $u = z\sin^{-1}(x/y)$; (d) $u = (x^2 + y^2 + z^2)^{1/2}$;
(e) $u = (x^2 + y^2 + z^2)^{-1/2}$.
3. Verify that $\partial^2 f/\partial x\,\partial y = \partial^2 f/\partial y\,\partial x$ for
(a) $f = \cos xy^2$, (b) $f = \sin^2 x\cos y$, (c) $f = e^{y/x}$.
4. Prove that if
(a) $f(x,y) = \log(x^2 + y^2) + \tan^{-1}(y/x)$, then $\dfrac{\partial^2 f}{\partial x^2} + \dfrac{\partial^2 f}{\partial y^2} = 0$;
(b) $f(x,y,z) = (x^2 + y^2 + z^2)^{-1/2}$, then $\dfrac{\partial^2 f}{\partial x^2} + \dfrac{\partial^2 f}{\partial y^2} + \dfrac{\partial^2 f}{\partial z^2} = 0$.
3. Total Differentials. The differential dy of a function y = f(x) is
defined by the formula
$$dy = f'(x)\,dx,$$ (3-1)
where $dx \equiv \Delta x$ is an arbitrary increment of the independent variable x.
We agree to call an increment of the independent variable x the differential
of x.
Fig. 2
Since (Fig. 2)
$$f'(x) = \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0}\frac{f(x + \Delta x) - f(x)}{\Delta x},$$ (3-2)
we can write, on recalling the definition of the limit,
$$\frac{\Delta y}{\Delta x} = f'(x) + \epsilon,$$ (3-3)
where $\lim_{\Delta x \to 0} \epsilon = 0$. Hence
$$\Delta y = f'(x)\,\Delta x + \epsilon\,\Delta x.$$ (3-4)
The substitution from (3-1) in (3-4) then yields
$$\Delta y = dy + \epsilon\,\Delta x, \qquad \lim \epsilon = 0 \text{ as } \Delta x \to 0.$$ (3-5)
Figure 2 illustrates geometrically the relations between $\Delta y$, dy, and dx,
and formula (3-5) shows that for small values of $\Delta x$, the increment $\Delta y$
is a good approximation to the differential dy in the sense that
$$\frac{\Delta y - dy}{\Delta x} = \epsilon,$$ (3-6)
where $\epsilon \to 0$ as $\Delta x \to 0$.
One can construct a similar approximation to the increment A u for
the function u ~ /(x,y) when x and y are allowed to acquire the respective
increments Ax and Ay.
The presentation of the essential ideas in this construction is greatly
simplified by the use of the mean-value theorem of the differential calculus.
This theorem states that whenever $f(x)$ has the derivative $f'(x)$ at every
point of the interval $(x, x + \Delta x)$, then
$$\frac{f(x + \Delta x) - f(x)}{\Delta x} = f'(\xi), \tag{3-7}$$
where $\xi$ is an intermediate point in the interval. The geometric meaning
of this theorem is exceedingly simple. Formula (3-7) states that the slope
$$\frac{f(x + \Delta x) - f(x)}{\Delta x}$$
of the secant line $AB$ (Fig. 3) is equal to the slope $f'(\xi)$ of the tangent
line $CD$ to the curve $y = f(x)$ at an intermediate point $\xi$ in the interval.
Since $\xi = x + \theta\,\Delta x$, where $\theta\,\Delta x$ is some fraction of the length $\Delta x$, we
can write (3-7) as
$$f(x + \Delta x) - f(x) = f'(x + \theta\,\Delta x)\,\Delta x, \qquad 0 < \theta < 1. \tag{3-8}$$
Consider now a function $u = f(x,y)$ of two variables. The increment
$\Delta u$ that results from replacing $x$ by $x + \Delta x$ and $y$ by $y + \Delta y$ is
$$\Delta u = f(x + \Delta x, y + \Delta y) - f(x,y). \tag{3-9}$$
If we add and subtract $f(x, y + \Delta y)$ in the right-hand member of (3-9),
we obtain
$$\Delta u = [f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y)] + [f(x, y + \Delta y) - f(x,y)]. \tag{3-10}$$
The expression in the first pair of brackets in (3-10) is the increment in
the function $f(x,y)$ when the second variable in it has a fixed value $y + \Delta y$.
Accordingly, we can apply formula (3-8) to it and write
$$f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y) = f_x(x + \theta_1\,\Delta x, y + \Delta y)\,\Delta x, \tag{3-11}$$
where $0 < \theta_1 < 1$.
Similarly, the application of (3-8) to the expression in the second set
of brackets in (3-10), in which $x$ has a fixed value, yields
$$f(x, y + \Delta y) - f(x,y) = f_y(x, y + \theta_2\,\Delta y)\,\Delta y, \tag{3-12}$$
where $0 < \theta_2 < 1$.
Now if the partial derivatives $f_x(x,y)$ and $f_y(x,y)$ are continuous functions,
then
$$f_x(x + \theta_1\,\Delta x, y + \Delta y) = f_x(x,y) + \epsilon_1, \qquad f_y(x, y + \theta_2\,\Delta y) = f_y(x,y) + \epsilon_2, \tag{3-13}$$
where $\lim \epsilon_1 = 0$ and $\lim \epsilon_2 = 0$ as $\Delta x$ and $\Delta y$ approach zero. Hence we
can write (3-11) and (3-12) in the forms
$$f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y) = [f_x(x,y) + \epsilon_1]\,\Delta x,$$
$$f(x, y + \Delta y) - f(x,y) = [f_y(x,y) + \epsilon_2]\,\Delta y,$$
so that (3-10) becomes
$$\Delta u = f_x(x,y)\,\Delta x + f_y(x,y)\,\Delta y + \epsilon_1\,\Delta x + \epsilon_2\,\Delta y. \tag{3-14}$$
If we define the differential $du$ of $u = f(x,y)$ by the formula
$$du = f_x\,\Delta x + f_y\,\Delta y \equiv \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y, \tag{3-15}$$
we can write (3-14) in a form analogous to (3-5):
$$\Delta u = du + \epsilon_1\,\Delta x + \epsilon_2\,\Delta y, \qquad \lim \epsilon_1 = 0, \ \lim \epsilon_2 = 0 \text{ as } \Delta x \to 0 \text{ and } \Delta y \to 0. \tag{3-16}$$
Formula (3-16) shows that when the increments $\Delta x$ and $\Delta y$ are small, the
differential $du$ is a good approximation to $\Delta u$ in the sense that
$$\frac{\Delta u - du}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} = \frac{\epsilon_1\,\Delta x + \epsilon_2\,\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \to 0$$
as $\Delta x$ and $\Delta y$ approach zero.
As in the case of functions of one independent variable, we agree to
write the increments $\Delta x$ and $\Delta y$ in the independent variables as $dx$ and $dy$,
respectively. Then (3-15) reads
$$du = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy. \tag{3-17}$$
Whenever (3-16) holds, the function $u = f(x,y)$ is said to be differentiable,
and $du$ in (3-17) is called the total differential.¹ A function which is
differentiable at each point of a region is said to be differentiable in the region.
The foregoing discussion shows that a function $f(x,y)$ is differentiable
whenever the partial derivatives $f_x$ and $f_y$ are continuous.
The foregoing considerations can be extended to functions $u = f(x_1, x_2, \ldots, x_n)$
of $n$ independent variables. The total differential $du$ is given
by the formula
$$du = \frac{\partial f}{\partial x_1}\,dx_1 + \frac{\partial f}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f}{\partial x_n}\,dx_n \tag{3-18}$$
whenever the partial derivatives $f_{x_i}$ are continuous functions.
It should be noted that the total differential $du$ is equal to the sum of
$n$ terms involving independent increments $dx_i$. When a number of small
changes are taking place simultaneously in a system, each one proceeds
as if it were independent of the others, and the total change is the sum of
the effects due to the independent changes. Physically, this corresponds
to the principle of superposition of effects.
Example 1. Find the total differential of $u = e^x y z^2$. Since $u_x$, $u_y$, $u_z$ are obviously
continuous functions, formula (3-18) yields
$$du = e^x y z^2\,dx + e^x z^2\,dy + 2e^x y z\,dz.$$
Example 2. A metal box without a top has inside dimensions 6 by 4 by 2 ft. If the
metal is 0.1 ft thick, find the actual volume of the metal used and compare it with the
approximate volume found by using the differential.
The actual volume is $\Delta V$, where
$$\Delta V = 6.2 \times 4.2 \times 2.1 - 6 \times 4 \times 2 = 54.684 - 48 = 6.684 \text{ ft}^3.$$
Since $V = xyz$, where $x = 6$, $y = 4$, $z = 2$,
$$dV = yz\,dx + xz\,dy + xy\,dz = 8(0.2) + 12(0.2) + 24(0.1) = 6.4 \text{ ft}^3.$$
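The arithmetic of Example 2 can be reproduced in a few lines. This is a minimal sketch; the function names are illustrative assumptions, and the increments 0.2, 0.2, 0.1 are those used in the example (metal on both sides of the length and width, but only on the bottom).

```python
def box_volume(x, y, z):
    """Volume of a rectangular box with edges x, y, z."""
    return x * y * z


def exact_and_differential(x, y, z, dx, dy, dz):
    """Return (exact increment of V, total differential dV) for V = xyz."""
    exact = box_volume(x + dx, y + dy, z + dz) - box_volume(x, y, z)
    approx = y * z * dx + x * z * dy + x * y * dz
    return exact, approx


if __name__ == "__main__":
    exact, approx = exact_and_differential(6.0, 4.0, 2.0, 0.2, 0.2, 0.1)
    print(f"exact increment: {exact:.3f} cubic ft")
    print(f"differential:    {approx:.3f} cubic ft")
```

The difference between the two numbers is the contribution of the products of increments, which the differential neglects.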
¹ In Chap. 5 we shall encounter expressions of the form (3-17) in which $f_x$ and $f_y$ become
discontinuous at certain points of the region and $u$ is a multiple-valued function.
Such expressions are generally called exact differentials, and they are also denoted by the
same symbol $du$. For technical reasons, explained in Chap. 5, it is usually necessary to
assume the continuity of $f_x$ and $f_y$, in which event the terms exact and total differentials
become synonymous. A geometric meaning of the differential (3-17) is given in Sec.
10, Chap. 4.
Example 3. Two sides of a triangular piece of land (Fig. 4) are measured as 100 and
125 ft, and the included angle is measured as 60°. If the possible errors are 0.2 ft in
measuring the sides and 1° in measuring the angle, what is the approximate error in the
area?
Since $A = \tfrac{1}{2} xy \sin \alpha$,
$$dA = \tfrac{1}{2}\,(y \sin \alpha\,dx + x \sin \alpha\,dy + xy \cos \alpha\,d\alpha),$$
and the approximate error is therefore
$$dA = \frac{1}{2}\left[125\left(\frac{\sqrt{3}}{2}\right)(0.2) + 100\left(\frac{\sqrt{3}}{2}\right)(0.2) + 100(125)\left(\frac{1}{2}\right)\frac{\pi}{180}\right] = 74.0 \text{ ft}^2.$$
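The same estimate can be evaluated numerically. The sketch below assumes a helper named `area_error`, which is not from the text; note that the angular error must be converted to radians before it is used in the differential.

```python
import math


def area_error(x, y, alpha_deg, dx, dy, dalpha_deg):
    """Total differential of A = (1/2) x y sin(alpha).

    Angles are given in degrees and converted to radians,
    since d(alpha) in the differential is a radian measure.
    """
    a = math.radians(alpha_deg)
    da = math.radians(dalpha_deg)
    return 0.5 * (y * math.sin(a) * dx
                  + x * math.sin(a) * dy
                  + x * y * math.cos(a) * da)


if __name__ == "__main__":
    print(round(area_error(100.0, 125.0, 60.0, 0.2, 0.2, 1.0), 1))
```

Most of the error comes from the angle term, which suggests that the angle measurement is the one worth improving.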
PROBLEMS
1. A closed cylindrical tank is 4 ft high and 2 ft in diameter (inside dimensions).
What is the approximate amount of metal in the wall and the ends of the tank if they
are 0.2 in. thick?
2. The angle of elevation of the top of a tower is found to be 30°, with a possible error
of 0.5°. The distance to the base of the tower is found to be 1,000 ft, with a possible
error of 0.1 ft. What is the possible error in the height of the tower as computed from
these measurements?
3. What is the possible error in the length of the hypotenuse of a right triangle if the
legs are found to be 11.5 and 7.8 ft, with a possible error of 0.1 ft in each measurement?
4. The constant $C$ in Boyle's law $pv = C$ is calculated from the measurements of $p$
and $v$. If $p$ is found to be 5,000 lb per ft² with a possible error of 1 per cent and $v$ is
found to be 15 ft³ with a possible error of 2 per cent, find the approximate possible error in
$C$ computed from these measurements.
5. The volume $v$, pressure $p$, and absolute temperature $T$ of a perfect gas are connected
by the formula $pv = RT$, where $R$ is a constant. If $T = 500°$, $p = 4{,}000$ lb per
ft², and $v = 15.2$ ft³, find the approximate change in $p$ when $T$ changes to 503° and $v$
to 15.25 ft³.
6. In estimating the cost of a pile of bricks measured as 6 by 50 by 4 ft, the tape is
stretched 1 per cent beyond the estimated length. If the count is 12 bricks to 1 ft³ and
bricks cost $20 per thousand, find the approximate error in cost.
7. In determining specific gravity by the formula $s = A/(A - W)$, where $A$ is the
weight in air and $W$ is the weight in water, $A$ can be read within 0.01 lb and $W$ within
0.02 lb. Find approximately the maximum error in $s$ if the readings are $A = 1.1$ lb
and $W = 0.6$ lb. Find the maximum relative error $\Delta s/s$.
8. The equation of a perfect gas is $pv = RT$. At a certain instant a given amount of
gas has a volume of 16 ft³ and is under a pressure of 36 psi. Assuming $R = 10.71$, find
the temperature T. If the volume is increasing at the rate of % efs and the pressure is
decreasing at the rate H psi per sec, find the rate at which the temperature is changing.
9. The period of a simple pendulum with small oscillations is
$$T = 2\pi\sqrt{\frac{l}{g}}.$$
If $T$ is computed using $l = 8$ ft and $g = 32$ ft per sec per sec, find the approximate
error in $T$ if the true values are $l = 8.05$ ft and $g = 32.01$ ft per sec per sec. Find also
the percentage error.
10. The diameter and altitude of a can in the shape of a right circular cylinder are
measured as 4 and 6 in., respectively. The possible error in each measurement is 0.1 in.
Find approximately the maximum possible error in the values computed for the volume
and the lateral surface.
11. We define an approximate relative error $e$ in the differentiable function $f$ by the
formula $e = df/f$. Show that the approximate relative error of the product is equal to
the sum of the approximate relative errors of the factors. Hint: $e = d \log f$.
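4. Chain Rule. Let $u = f(x,y)$ be a function of the variables $x$ and $y$
which, in turn, are functions of some independent variable $t$. If $t$ is given
an increment $\Delta t$, the functions $x$ and $y$ will acquire increments $\Delta x$ and $\Delta y$,
and consequently $u$ will receive an increment $\Delta u$.
Assuming that $u = f(x,y)$ is continuous together with its partial derivatives,
one can write [see (3-14)]
$$\Delta u = \frac{\partial u}{\partial x}\,\Delta x + \frac{\partial u}{\partial y}\,\Delta y + \epsilon_1\,\Delta x + \epsilon_2\,\Delta y.$$
Dividing both sides of this expression by $\Delta t$ gives
$$\frac{\Delta u}{\Delta t} = \frac{\partial u}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial u}{\partial y}\frac{\Delta y}{\Delta t} + \epsilon_1\frac{\Delta x}{\Delta t} + \epsilon_2\frac{\Delta y}{\Delta t}. \tag{4-1}$$
Now if it is supposed that $x$ and $y$ can be differentiated with respect to $t$,
the expression (4-1) gives, upon passing to the limit as $\Delta t \to 0$,
$$\frac{du}{dt} = \frac{\partial u}{\partial x}\frac{dx}{dt} + \frac{\partial u}{\partial y}\frac{dy}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}, \tag{4-2}$$
since $\epsilon_1 \to 0$ and $\epsilon_2 \to 0$. The reason for the vanishing of $\epsilon_1$ and $\epsilon_2$ as
$\Delta t \to 0$ is as follows. Since $x$ and $y$ are assumed to be differentiable functions
of $t$, the identities
$$\Delta x = \frac{\Delta x}{\Delta t}\,\Delta t, \qquad \Delta y = \frac{\Delta y}{\Delta t}\,\Delta t$$
show that $\Delta x \to 0$ and $\Delta y \to 0$ as $\Delta t \to 0$. But when $\Delta x \to 0$ and
$\Delta y \to 0$ we know that $\epsilon_1$ and $\epsilon_2$ approach zero by (3-16).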
Formula (4-2) gives the rule for the differentiation of composite functions.
It is clear that if $u$ is a function of a set of variables $x_1, x_2, \ldots, x_n$, where
each variable is a differentiable function of an independent variable $t$,
the derivative of $u$ with respect to $t$ is given by the chain rule:
$$\frac{du}{dt} = \frac{\partial u}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial u}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial u}{\partial x_n}\frac{dx_n}{dt}. \tag{4-3}$$
A special case of formula (4-2) is of interest. If it is assumed that
$t = x$, (4-2) becomes
$$\frac{du}{dx} = \frac{\partial u}{\partial x} + \frac{\partial u}{\partial y}\frac{dy}{dx}. \tag{4-4}$$
Formula (4-4) can be used to calculate the derivative of the implicit
function given by
$$f(x,y) = 0. \tag{4-5}$$
Let it be assumed that (4-5) can be solved for $y$ to yield a real solution
$$y = \varphi(x); \tag{4-6}$$
then the substitution of (4-6) in the left-hand member of (4-5) gives an
identity
$$0 = f(x,y), \qquad \text{where } y = \varphi(x). \tag{4-7}$$
Applying (4-4) to (4-7) gives
$$0 = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx},$$
and solving for $dy/dx$,
$$\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}. \tag{4-8}$$
The formula (4-8) assumes that $\partial f/\partial y$ does not vanish for the point
$(x_0, y_0)$ at which the derivative is calculated.
Example 1. Let $f(x,y) = 3x^3 y^2 + x \cos y = 0$; then
$$\frac{\partial f}{\partial x} = 9x^2 y^2 + \cos y, \qquad \frac{\partial f}{\partial y} = 6x^3 y - x \sin y,$$
so that
$$\frac{dy}{dx} = -\frac{9x^2 y^2 + \cos y}{6x^3 y - x \sin y}$$
for all values of $x$ and $y$ that satisfy the equation
$3x^3 y^2 + x \cos y = 0$
and for which $6x^3 y - x \sin y \neq 0$.
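The formula of Example 1 can be compared with a slope obtained by actually solving the equation for $y$ at two neighboring values of $x$. The sketch below is illustrative only: the Newton iteration, the function names, and the sample point ($y = 2$, with $x$ chosen so that the equation holds) are assumptions, not part of the text.

```python
import math


def f(x, y):
    return 3 * x**3 * y**2 + x * math.cos(y)


def f_x(x, y):
    return 9 * x**2 * y**2 + math.cos(y)


def f_y(x, y):
    return 6 * x**3 * y - x * math.sin(y)


def slope_formula(x, y):
    """dy/dx from formula (4-8); requires f_y != 0."""
    return -f_x(x, y) / f_y(x, y)


def solve_y(x, y_guess, steps=50):
    """Newton's method in y for f(x, y) = 0 with x held fixed."""
    y = y_guess
    for _ in range(steps):
        y -= f(x, y) / f_y(x, y)
    return y


def slope_numeric(x, y, h=1e-6):
    """Slope of the implicit curve from two nearby solutions."""
    return (solve_y(x + h, y) - solve_y(x - h, y)) / (2 * h)


if __name__ == "__main__":
    y0 = 2.0
    x0 = math.sqrt(-math.cos(y0) / 12.0)  # makes 3*x0^2*y0^2 + cos(y0) = 0
    print(slope_formula(x0, y0), slope_numeric(x0, y0))
```

The two printed numbers should agree to several figures at any point of the curve where $\partial f/\partial y \neq 0$.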
Example 2. Let $x^2 + y^2 = 0$; then $\partial f/\partial x = 2x$, $\partial f/\partial y = 2y$. But it does not follow
that
$$\frac{dy}{dx} = -\frac{x}{y}.$$
This result is absurd inasmuch as the only real values of $x$ and $y$ that satisfy $x^2 + y^2 = 0$
are $x = 0$ and $y = 0$. Since $\partial f/\partial y$ vanishes at this point, the formal procedure used in
obtaining $dy/dx$ is meaningless.
Example 3. Let $f(x,y) = 0$ represent the locus of a curve, and let $P(x_0,y_0)$ be a point
on the curve. The equation of the tangent line to the curve at the point $P$ is
$$y - y_0 = \left(\frac{dy}{dx}\right)_{x = x_0} (x - x_0).$$
It follows from (4-8) that this equation can be written in the form
$$f_x(x_0,y_0)(x - x_0) + f_y(x_0,y_0)(y - y_0) = 0.$$
PROBLEMS
1. Find the equation of the tangent line to the ellipse
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
at the point $(x_0,y_0)$.
2. Find the equation of the tangent line to the folium of Descartes
$$x^3 + y^3 - 3axy = 0.$$
Note particularly the behavior of the tangent line to the folium at (0,0).
3 . Find du/di if
\x » e* +
tan“
and
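4. Find the equation of the tangent line to the ellipse
$$x = a \cos \theta, \qquad y = b \sin \theta,$$
at the point where $\theta = \pi/4$.
5. (a) Find $du/dt$, if $u = e^x \sin yz$ and $x = t^2$, $y = t - 1$, $z = 1/t$;
(b) find $\partial u/\partial r$ and $\partial u/\partial \theta$, if $u = x^2 - 4y^2$, $x = r \sec \theta$, and $y = r \tan \theta$.
6. (a) Find $\partial u/\partial x$ and $du/dx$, if $u = x^2 + y^2$ and $y = \tan x$;
(b) given $V = f(x,y,z)$, where $x = r \cos \theta$, $y = r \sin \theta$, $z = t$; compute $\partial V/\partial r$, $\partial V/\partial \theta$,
$\partial V/\partial t$ in terms of $\partial V/\partial x$, $\partial V/\partial y$, and $\partial V/\partial z$.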
5. Differentiation of Composite and Implicit Functions. The reasoning
employed in the preceding section can be applied in obtaining the total
differential, and hence the derivative, of a function of $n$ variables
$$u = f(x_1, x_2, \ldots, x_n),$$
where
$$x_i = x_i(t), \qquad i = 1, 2, \ldots, n,$$
are $n$ differentiable functions of a single variable $t$. The resulting expression
for the total differential is
$$du = \frac{\partial f}{\partial x_1}\,dx_1 + \frac{\partial f}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f}{\partial x_n}\,dx_n. \tag{5-1}$$
A question arises concerning the validity of formula (5-1) in the case
where the variables $x_i$ are functions of several independent variables
$t_1, t_2, \ldots, t_m$. Thus, let
$$u = f(x_1, x_2, \ldots, x_n) \tag{5-2}$$
be a function of the $n$ variables $x_i$, where the $x_i$ are functions of the variables
$t_1, t_2, \ldots, t_m$, say
$$x_i = x_i(t_1, t_2, \ldots, t_m), \qquad i = 1, 2, \ldots, n. \tag{5-3}$$
If all the variables except one, say $t_k$, are held fast, (5-2) becomes a function
of the single variable $t_k$ and one can calculate the derivative $\partial f/\partial t_k$ with
the aid of (5-1). The notation $\partial f/\partial t_k$, instead of $df/dt_k$, is used to signify
the fact that all variables except $t_k$ are held fast.
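Assuming the continuity of the derivatives involved, one can write
$$\begin{aligned}
\frac{\partial f}{\partial t_1} &= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t_1} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial t_1} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial t_1}, \\
\frac{\partial f}{\partial t_2} &= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t_2} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial t_2} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial t_2}, \\
&\;\;\vdots \\
\frac{\partial f}{\partial t_m} &= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t_m} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial t_m} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial t_m}.
\end{aligned}$$
If $\partial f/\partial t_1$, $\partial f/\partial t_2$, $\ldots$, $\partial f/\partial t_m$ are multiplied, respectively, by $dt_1$, $dt_2$,
$\ldots$, $dt_m$ and the resulting expressions added, one obtains
$$\begin{aligned}
\frac{\partial f}{\partial t_1}\,dt_1 + \frac{\partial f}{\partial t_2}\,dt_2 + \cdots + \frac{\partial f}{\partial t_m}\,dt_m
&= \frac{\partial f}{\partial x_1}\left(\frac{\partial x_1}{\partial t_1}\,dt_1 + \frac{\partial x_1}{\partial t_2}\,dt_2 + \cdots + \frac{\partial x_1}{\partial t_m}\,dt_m\right) \\
&\quad + \frac{\partial f}{\partial x_2}\left(\frac{\partial x_2}{\partial t_1}\,dt_1 + \frac{\partial x_2}{\partial t_2}\,dt_2 + \cdots + \frac{\partial x_2}{\partial t_m}\,dt_m\right) \\
&\quad + \cdots \\
&\quad + \frac{\partial f}{\partial x_n}\left(\frac{\partial x_n}{\partial t_1}\,dt_1 + \frac{\partial x_n}{\partial t_2}\,dt_2 + \cdots + \frac{\partial x_n}{\partial t_m}\,dt_m\right).
\end{aligned}$$
The left-hand member of this expression is the total differential of $f(x_1, x_2, \ldots, x_n)$,
regarded as a function of the independent variables $t_1, t_2, \ldots, t_m$,
whereas the terms in the parentheses in the right-hand member are precisely
the total differentials of (5-3). Hence, one can write
$$df = \frac{\partial f}{\partial x_1}\,dx_1 + \frac{\partial f}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f}{\partial x_n}\,dx_n,$$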
which shows that formula (5-1) is valid whether the $x_i$'s are the independent
variables or are functions of any set of other independent variables.
The foregoing can be summarized as follows:
Theorem. If $u = f(x_1, x_2, \ldots, x_n)$, then
$$du = \frac{\partial f}{\partial x_1}\,dx_1 + \frac{\partial f}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f}{\partial x_n}\,dx_n,$$
regardless of whether the variables $x_i$ are the independent variables or are
functions of other independent variables $t_k$. It is understood that all the
derivatives involved (the $\partial f/\partial x_i$ and $\partial x_i/\partial t_k$) are continuous functions.
The fact that the total differential of a composite function has the same
form irrespective of whether the variables involved are independent or
not permits one to use the same formulas for calculating differentials as
those established for the functions of a single variable. Thus,
$$d(u + v) = du + dv,$$
$$d(uv) = \frac{\partial (uv)}{\partial u}\,du + \frac{\partial (uv)}{\partial v}\,dv = v\,du + u\,dv,$$
and so forth.
Example 1. If $u = xy + yz + zx$, $x = t$, $y = e^{-t}$, and $z = \cos t$,
$$\begin{aligned}
\frac{du}{dt} &= (y + z)\frac{dx}{dt} + (x + z)\frac{dy}{dt} + (x + y)\frac{dz}{dt} \\
&= (e^{-t} + \cos t)(1) + (t + \cos t)(-e^{-t}) + (t + e^{-t})(-\sin t) \\
&= e^{-t} + \cos t - te^{-t} - e^{-t}\cos t - t \sin t - e^{-t}\sin t.
\end{aligned}$$
This example illustrates the fact that this method of computing $du/dt$ is often shorter
than the old method in which the values of $x$, $y$, and $z$ in terms of $t$ are substituted in
the expression for $u$ before the derivative is computed.
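The result of Example 1 can be confirmed by comparing the chain-rule expression with a finite-difference derivative of the substituted function. This is only a sketch; the function names and the sample value of $t$ are illustrative assumptions.

```python
import math


def u_of_t(t):
    """u = xy + yz + zx after substituting x = t, y = e^(-t), z = cos t."""
    x, y, z = t, math.exp(-t), math.cos(t)
    return x * y + y * z + z * x


def du_dt_chain(t):
    """du/dt assembled term by term from the chain rule (4-3)."""
    x, y, z = t, math.exp(-t), math.cos(t)
    dx, dy, dz = 1.0, -math.exp(-t), -math.sin(t)
    return (y + z) * dx + (x + z) * dy + (x + y) * dz


def du_dt_closed(t):
    """The expanded expression given at the end of Example 1."""
    e = math.exp(-t)
    return (e + math.cos(t) - t * e - e * math.cos(t)
            - t * math.sin(t) - e * math.sin(t))


def du_dt_numeric(t, h=1e-6):
    """Central-difference derivative of the substituted function."""
    return (u_of_t(t + h) - u_of_t(t - h)) / (2 * h)


if __name__ == "__main__":
    t = 0.7
    print(du_dt_chain(t), du_dt_closed(t), du_dt_numeric(t))
```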
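Example 2. If $f(x,y) = x^2 + y^2$, where $x = r \cos \varphi$ and $y = r \sin \varphi$, then
$$\frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} = 2x \cos \varphi + 2y \sin \varphi = 2r \cos^2 \varphi + 2r \sin^2 \varphi = 2r,$$
$$\frac{\partial f}{\partial \varphi} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \varphi} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \varphi} = 2x(-r \sin \varphi) + 2y(r \cos \varphi) = -2r^2 \cos \varphi \sin \varphi + 2r^2 \cos \varphi \sin \varphi = 0.$$
Also, $df = 2r\,dr$ or $df = 2x\,dx + 2y\,dy$.
Since $f(x,y) = x^2 + y^2 = r^2$, these results could have been obtained directly.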
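Example 3. Let $z = e^{xy}$, where $x = \log (u + v)$ and $y = \tan^{-1}(u/v)$. Then,
$$\frac{\partial x}{\partial u} = \frac{1}{u + v}, \qquad \frac{\partial y}{\partial u} = \frac{v}{v^2 + u^2}.$$
Hence,
$$\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u} = \frac{ye^{xy}}{u + v} + \frac{xe^{xy}v}{v^2 + u^2}.$$
Similarly,
$$\frac{\partial z}{\partial v} = \frac{ye^{xy}}{u + v} - \frac{xe^{xy}u}{v^2 + u^2}.$$
The same results can be obtained by noting that
$$dz = ye^{xy}\,dx + xe^{xy}\,dy.$$
But
$$dx = \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv = \frac{1}{u + v}\,du + \frac{1}{u + v}\,dv$$
and
$$dy = \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv = \frac{v}{v^2 + u^2}\,du - \frac{u}{v^2 + u^2}\,dv.$$
Hence,
$$dz = ye^{xy}\,\frac{du + dv}{u + v} + xe^{xy}\,\frac{v\,du - u\,dv}{v^2 + u^2} = \left(\frac{ye^{xy}}{u + v} + \frac{xe^{xy}v}{v^2 + u^2}\right)du + \left(\frac{ye^{xy}}{u + v} - \frac{xe^{xy}u}{v^2 + u^2}\right)dv.$$
But
$$dz = \frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv,$$
and since $du$ and $dv$ are independent differentials, equating the coefficients of $du$ and
$dv$ in the two expressions for $dz$ gives
$$\frac{\partial z}{\partial u} = \frac{ye^{xy}}{u + v} + \frac{xe^{xy}v}{v^2 + u^2}, \qquad \frac{\partial z}{\partial v} = \frac{ye^{xy}}{u + v} - \frac{xe^{xy}u}{v^2 + u^2}.$$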
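Let $f(x,y,z) = 0$ define any one of the variables as an implicit function
of the remaining ones. If $x$ and $y$ are thought to be the independent
variables and one can obtain a real differentiable solution for $z$ in terms
of $x$ and $y$, it is possible to write
$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy.$$
But
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz = 0.$$
Substituting the value of $dz$ in this equation gives
$$\left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y}\right)dy = 0.$$
Since $x$ and $y$ are independent variables, we get, on setting in turn $dx = 0$
and $dy = 0$,
$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x} = 0 \qquad \text{and} \qquad \frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y} = 0.$$
These equations could have been obtained directly by applying the chain
rule to the equation $f(x,y,z) = 0$, in which $z$ is regarded as a function of
$x$ and $y$, but we wished to illustrate another procedure followed in Sec. 10
and elsewhere. If $\partial f/\partial z \neq 0$, these equations give
$$\frac{\partial z}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial z}, \qquad \frac{\partial z}{\partial y} = -\frac{\partial f/\partial y}{\partial f/\partial z}. \tag{5-4}$$
The formulas (5-4) permit one to calculate the partial derivatives of
the function $z$ defined implicitly by an equation
$$f(x,y,z) = 0.$$
As an illustration, let
$$x^2 + 2y^2 - 3xz + 1 = 0.$$
Then, by (5-4),
$$\frac{\partial z}{\partial x} = -\frac{2x - 3z}{-3x}, \qquad \frac{\partial z}{\partial y} = -\frac{4y}{-3x}.$$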
Example 4. A function $f(x_1, x_2, \ldots, x_n)$ of $n$ variables $x_1, x_2, \ldots, x_n$ is said to be
homogeneous of degree $m$ if the function is multiplied by $\lambda^m$ when the arguments $x_1, x_2, \ldots,
x_n$ are replaced by $\lambda x_1, \lambda x_2, \ldots, \lambda x_n$, respectively. For example, $f(x,y) = x^2/\sqrt{x^2 + y^2}$
is homogeneous of degree 1, because the substitution of $\lambda x$ for $x$ and $\lambda y$ for $y$ yields
$\lambda x^2/\sqrt{x^2 + y^2}$. Again, $f(x,y) = (1/y) + (\log x - \log y)/x$ is homogeneous of degree
—1, whereas f(x,y,z) » z 2 /\/ x 2 + y 2 is homogeneous of degree %.
There is an important theorem, due to Euler, concerning homogeneous functions.
Euler's Theorem. If $u = f(x_1, x_2, \ldots, x_n)$ is homogeneous of degree $m$ and has continuous
first partial derivatives, then
$$x_1\frac{\partial f}{\partial x_1} + x_2\frac{\partial f}{\partial x_2} + \cdots + x_n\frac{\partial f}{\partial x_n} = mf(x_1, x_2, \ldots, x_n).$$
The proof of the theorem follows at once upon substituting
$$x_1' = \lambda x_1, \quad x_2' = \lambda x_2, \quad \ldots, \quad x_n' = \lambda x_n.$$
Then, since $f(x_1, x_2, \ldots, x_n)$ is homogeneous of degree $m$,
$$f(x_1', x_2', \ldots, x_n') = \lambda^m f(x_1, x_2, \ldots, x_n).$$
Differentiating with respect to $\lambda$ gives
$$\frac{\partial f}{\partial x_1'}\,x_1 + \frac{\partial f}{\partial x_2'}\,x_2 + \cdots + \frac{\partial f}{\partial x_n'}\,x_n = m\lambda^{m-1} f(x_1, x_2, \ldots, x_n).$$
If $\lambda$ is set equal to 1, then $x_1' = x_1$, $x_2' = x_2$, $\ldots$, $x_n' = x_n$, and the theorem follows.
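Euler's theorem is easy to test numerically on the first function mentioned in Example 4, $f(x,y) = x^2/\sqrt{x^2 + y^2}$, which is homogeneous of degree 1. The following is a sketch under stated assumptions: the partial derivatives are estimated by central differences, and the names `euler_lhs` and the sample point are illustrative.

```python
import math


def f(x, y):
    """Homogeneous of degree 1: f(kx, ky) = k f(x, y)."""
    return x**2 / math.sqrt(x**2 + y**2)


def euler_lhs(x, y, h=1e-6):
    """x*f_x + y*f_y with partials estimated by central differences."""
    f_x = (f(x + h, y) - f(x - h, y)) / (2 * h)
    f_y = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return x * f_x + y * f_y


if __name__ == "__main__":
    x, y, m = 1.5, -0.8, 1
    # Euler's theorem predicts that these two numbers agree.
    print(euler_lhs(x, y), m * f(x, y))
```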
PROBLEMS
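1. (a) Find $dy/dx$ if $x \sec y + x^3 y^2 = 1$;
(b) find $\partial z/\partial x$ and $\partial z/\partial y$ if $x^3 y - \sin z + z^3 = 0$.
2. If $f$ is a function of $u$ and $v$, where $u = \sqrt{x^2 + y^2}$ and $v = \tan^{-1}(y/x)$, find $\partial f/\partial x$,
$\partial f/\partial y$, $\sqrt{(\partial f/\partial x)^2 + (\partial f/\partial y)^2}$.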
8. If / is a function of u and v , where u ** r cos a and f» « r sin 0, find
f/ df t
dr* 60*
4. If x * x' cos 0 y 1 sin 6, y «* x' sin 9 -f j/' cos prove that
( 3 ,+ ®‘*(£)‘+( 5 )‘
5. Find the total differential if $u = x^2 + y^2$, $x = r \cos \theta$, and $y = r \sin \theta$.
6. If $f = e^{xy}$, where $x = \log (u^2 + v^2)^{1/2}$ and $y = \tan^{-1}(u/v)$, find $\partial f/\partial u$ and $\partial f/\partial v$.
7. If z « (n -f e)/(l — we), u « y sin x, and r *» e v>r , find dz/dx and dz/dy .
8. Find dz/dr and dz/ds if z ** (x — y)/(I + xy), x = tan (r — s), and y « c rt .
9. Verify Euler’s theorem for each of the following functions:
(a) f(x t y t z)
(» /(*,y) -
(c) /(x,y) -
(d) f(x,y t z)
» x 2 y -f xy 2 + 2xyz;
V
JL _l IgjLf ~~ log y
y 2 x*
Vx 2 - y 2
(e) f(x,y,z)
(/) /(x,y) -
(y) f(x,y) *
(*) f(x t y) =
- (x 2 + y 2 T z 2 )~ H ;
V jc -f y ^
y
x 2 -f y 2
6. Higher Derivatives of Implicit Functions. The problem of calculating
the derivative of $y$ with respect to $x$ when $y$ is an implicit function of the
independent variable $x$ defined by
$$f(x,y) = 0 \tag{6-1}$$
was discussed in Sec. 4. It was shown there that
$$f_x(x,y) + f_y(x,y)\,\frac{dy}{dx} = 0. \tag{6-2}$$
Differentiating this equation again and assuming that all the derivatives
involved are continuous functions of $x$ and $y$ give
$$f_{xx}(x,y) + 2f_{xy}(x,y)\,\frac{dy}{dx} + f_{yy}(x,y)\left(\frac{dy}{dx}\right)^2 + f_y(x,y)\,\frac{d^2 y}{dx^2} = 0. \tag{6-3}$$
If $f_y(x,y) \neq 0$ at the point where the derivative is desired, (6-3) can
be solved for $d^2 y/dx^2$ and the value of $dy/dx$ substituted from (6-2). The
result is
$$\frac{d^2 y}{dx^2} = -\frac{f_{xx} f_y^2 - 2f_{xy} f_x f_y + f_{yy} f_x^2}{f_y^3}.$$
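As a quick check of the second-derivative formula, consider the circle $x^2 + y^2 - 1 = 0$, an illustrative case not taken from the text. On the upper half, $y = \sqrt{1 - x^2}$, and direct differentiation gives $d^2y/dx^2 = -1/y^3$. The sketch below, with hypothetical function names, evaluates the general formula with the partial derivatives of this $f$ and compares.

```python
import math


def second_derivative(fx, fy, fxx, fxy, fyy):
    """d2y/dx2 for f(x, y) = 0, from (6-3) with dy/dx = -fx/fy."""
    if fy == 0:
        raise ValueError("formula requires f_y != 0")
    return -(fxx * fy**2 - 2 * fxy * fx * fy + fyy * fx**2) / fy**3


def circle_second_derivative(x, y):
    """Apply the formula to f = x^2 + y^2 - 1."""
    return second_derivative(fx=2 * x, fy=2 * y, fxx=2.0, fxy=0.0, fyy=2.0)


if __name__ == "__main__":
    x = 0.6
    y = math.sqrt(1 - x**2)  # point on the upper semicircle
    print(circle_second_derivative(x, y), -1 / y**3)
```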
The process can be continued to obtain the derivatives of higher orders.
A similar procedure can be employed to calculate the partial derivatives
of a function $z$ of two independent variables $x$ and $y$ defined implicitly
by an equation of the form
$$f(x,y,z) = 0. \tag{6-4}$$
Differentiating (6-4) with respect to $x$ and $y$ in turn gives
$$f_x(x,y,z) + f_z(x,y,z)\,\frac{\partial z}{\partial x} = 0, \qquad f_y(x,y,z) + f_z(x,y,z)\,\frac{\partial z}{\partial y} = 0. \tag{6-5}$$
If $f_z(x,y,z)$ does not vanish for those values of $x$, $y$, and $z$ that satisfy
(6-4), then Eqs. (6-5) can be solved for $\partial z/\partial x$ and $\partial z/\partial y$. Partial derivatives
of higher order can then be obtained by differentiating equations
(6-5).
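Example. Let it be required to find the derivatives of second order of the function $z$
defined implicitly by the equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$
Differentiating this equation with respect to $x$ and $y$ gives
$$\frac{2x}{a^2} + \frac{2z}{c^2}\frac{\partial z}{\partial x} = 0, \qquad \frac{2y}{b^2} + \frac{2z}{c^2}\frac{\partial z}{\partial y} = 0. \tag{6-6}$$
Differentiating the first of Eqs. (6-6) with respect to $x$ and $y$, one obtains
$$\frac{2}{a^2} + \frac{2}{c^2}\left(\frac{\partial z}{\partial x}\right)^2 + \frac{2z}{c^2}\frac{\partial^2 z}{\partial x^2} = 0, \qquad \frac{2}{c^2}\frac{\partial z}{\partial x}\frac{\partial z}{\partial y} + \frac{2z}{c^2}\frac{\partial^2 z}{\partial x\,\partial y} = 0.$$
Solving for $\partial^2 z/\partial x^2$ and $\partial^2 z/\partial x\,\partial y$ and making use of (6-6), one obtains
$$\frac{\partial^2 z}{\partial x^2} = -\frac{c^2(a^2 z^2 + c^2 x^2)}{a^4 z^3}, \qquad \frac{\partial^2 z}{\partial x\,\partial y} = -\frac{c^4 xy}{a^2 b^2 z^3}.$$
In a similar way the differentiation of the second of Eqs. (6-6) with respect to $y$ yields
$$\frac{\partial^2 z}{\partial y^2} = -\frac{c^2(b^2 z^2 + c^2 y^2)}{b^4 z^3}.$$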
PROBLEMS
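1. Find $y'$, $y''$, $y'''$ if $x^3 + y^3 - 3axy = 0$.
2. Find $\partial z/\partial x$, $\partial z/\partial y$, $\partial^2 z/\partial x^2$, $\partial^2 z/\partial x\,\partial y$, and $\partial^2 z/\partial y^2$ at (1,1,1) if $x^2 - y^2 + z^2 = 1$.
3. Find $\partial z/\partial x$, if
(a) $xz^2 - yz^2 + xy^2 z - 5 = 0$; (b) $xz^3 - yz + 3xy = 0$.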
7. Change of Variables. The main purpose of this section is to develop
manipulative skill in calculating the derivatives of implicit functions and
to indicate the formal modes of attack on the problem. The continuity
of the functions and their partial derivatives is assumed throughout this
section and will not be referred to again.
Let
$$w = f(u,v) \tag{7-1}$$
denote a function of two independent variables $u$ and $v$, and suppose that
$u$ and $v$ are connected with some other variables $x$ and $y$ by means of the
relations
$$x = x(u,v), \qquad y = y(u,v). \tag{7-2}$$
If Eqs. (7-2) are solved for $u$ and $v$ to yield
$$u = u(x,y), \qquad v = v(x,y), \tag{7-3}$$
and the expressions (7-3) are substituted for $u$ and $v$ in (7-1), there will
result a function of $x$ and $y$, say
$$w = F(x,y). \tag{7-4}$$
The partial derivatives of $w$ with respect to $x$ and $y$ can be calculated
from (7-4) directly, but frequently it is impracticable to obtain the solution
(7-3), and we consider an indirect mode of calculation. By the rule
for the differentiation of composite functions,
$$\frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u}, \qquad \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v}. \tag{7-5}$$
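The partial derivatives $\partial x/\partial u$, $\partial y/\partial u$, $\partial x/\partial v$, and $\partial y/\partial v$ can be calculated
from (7-2), and hence they may be regarded as known functions of $u$ and
$v$. The partial derivatives in the left-hand members of (7-5) are also
known functions of $u$ and $v$, since they can be calculated from (7-1).
Hence, equations (7-5) may be regarded as linear equations for the
determination of $\partial w/\partial x$ and $\partial w/\partial y$. Assuming that the Jacobian $J(u,v)$
defined by
$$J(u,v) \equiv \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2ex] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix}$$
is not zero and solving by Cramer's rule give
$$\frac{\partial w}{\partial x} = \frac{1}{J(u,v)}\begin{vmatrix} \dfrac{\partial w}{\partial u} & \dfrac{\partial y}{\partial u} \\[2ex] \dfrac{\partial w}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix}, \qquad \frac{\partial w}{\partial y} = \frac{1}{J(u,v)}\begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial w}{\partial u} \\[2ex] \dfrac{\partial x}{\partial v} & \dfrac{\partial w}{\partial v} \end{vmatrix}.$$
The resulting expressions for $\partial w/\partial x$ and $\partial w/\partial y$ are known functions of
$u$ and $v$ and thus can be treated exactly like (7-1) if it is desirable to calculate
the derivatives of higher orders.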
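As an example, consider the function $w(r,\theta)$, and let it be required to
calculate the partial derivatives of $w$ with respect to $x$ and $y$, where $x =
r \cos \theta$ and $y = r \sin \theta$. Now
$$\frac{\partial w}{\partial r} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial r} = \frac{\partial w}{\partial x}\cos \theta + \frac{\partial w}{\partial y}\sin \theta,$$
$$\frac{\partial w}{\partial \theta} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial \theta} = -\frac{\partial w}{\partial x}\,r \sin \theta + \frac{\partial w}{\partial y}\,r \cos \theta.$$
Solving these equations for $\partial w/\partial x$ and $\partial w/\partial y$ in terms of $\partial w/\partial r$ and $\partial w/\partial \theta$
gives
$$\frac{\partial w}{\partial x} = \frac{\partial w}{\partial r}\cos \theta - \frac{\partial w}{\partial \theta}\frac{\sin \theta}{r},$$
$$\frac{\partial w}{\partial y} = \frac{\partial w}{\partial r}\sin \theta + \frac{\partial w}{\partial \theta}\frac{\cos \theta}{r}.$$
The Jacobian $J$ is, in this case,
$$J = \begin{vmatrix} \cos \theta & \sin \theta \\ -r \sin \theta & r \cos \theta \end{vmatrix} = r,$$
which does not vanish unless $r = 0$.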
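The polar formulas just obtained can be tried on a function whose Cartesian derivatives are known in advance. In the sketch below the test function $w = x^2 y$, the function names, and the sample point are all assumptions made for illustration; the polar derivatives are estimated by central differences and then converted with the two formulas above.

```python
import math


def w_polar(r, theta):
    """The test function w = x^2 * y expressed through r and theta."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    return x * x * y


def cartesian_partials(r, theta, h=1e-6):
    """Recover dw/dx and dw/dy from dw/dr and dw/dtheta (needs r != 0)."""
    w_r = (w_polar(r + h, theta) - w_polar(r - h, theta)) / (2 * h)
    w_t = (w_polar(r, theta + h) - w_polar(r, theta - h)) / (2 * h)
    w_x = math.cos(theta) * w_r - math.sin(theta) / r * w_t
    w_y = math.sin(theta) * w_r + math.cos(theta) / r * w_t
    return w_x, w_y


if __name__ == "__main__":
    r, theta = 2.0, 0.5
    x, y = r * math.cos(theta), r * math.sin(theta)
    print(cartesian_partials(r, theta))  # from the polar formulas
    print((2 * x * y, x * x))            # direct: dw/dx = 2xy, dw/dy = x^2
```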
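As a somewhat more complicated instance of implicit differentiation,
consider a pair of equations
$$F(x,y,u,v) = 0, \qquad G(x,y,u,v) = 0, \tag{7-6}$$
and let it be supposed that they can be solved for $u$ and $v$ in terms of $x$
and $y$ to yield
$$u = u(x,y), \qquad v = v(x,y). \tag{7-7}$$
The partial derivatives of $u$ and $v$ with respect to $x$ and $y$ can be obtained
in the following manner. Considering $x$ and $y$ as the independent variables
and differentiating Eqs. (7-6) with respect to $x$ and $y$ give
$$\begin{aligned}
\frac{\partial F}{\partial x} + \frac{\partial F}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial F}{\partial v}\frac{\partial v}{\partial x} &= 0, &\qquad \frac{\partial F}{\partial y} + \frac{\partial F}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial F}{\partial v}\frac{\partial v}{\partial y} &= 0, \\
\frac{\partial G}{\partial x} + \frac{\partial G}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial G}{\partial v}\frac{\partial v}{\partial x} &= 0, &\qquad \frac{\partial G}{\partial y} + \frac{\partial G}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial G}{\partial v}\frac{\partial v}{\partial y} &= 0.
\end{aligned} \tag{7-8}$$
Equations (7-8) are linear in $\partial u/\partial x$, $\partial u/\partial y$, $\partial v/\partial x$, and $\partial v/\partial y$. If
$$J(u,v) \equiv \begin{vmatrix} \dfrac{\partial F}{\partial u} & \dfrac{\partial F}{\partial v} \\[2ex] \dfrac{\partial G}{\partial u} & \dfrac{\partial G}{\partial v} \end{vmatrix} \neq 0,$$
the partial derivatives in question can be determined from (7-8) by
Cramer's rule.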
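A special case of Eqs. (7-6) is useful in applications. Let
$$x = f(u,v), \qquad y = g(u,v).$$
Differentiating these equations with respect to $x$ and remembering that
$x$ and $y$ are independent variables, one obtains
$$1 = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x}, \qquad 0 = \frac{\partial g}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial g}{\partial v}\frac{\partial v}{\partial x}. \tag{7-9}$$
These equations can be solved for $\partial u/\partial x$ and $\partial v/\partial x$ if
$$J(u,v) \equiv \begin{vmatrix} \dfrac{\partial f}{\partial u} & \dfrac{\partial f}{\partial v} \\[2ex] \dfrac{\partial g}{\partial u} & \dfrac{\partial g}{\partial v} \end{vmatrix} \neq 0.$$
Example 1. Let
$$u^2 - v^2 + 2x = 0, \qquad uv - y = 0.$$
Differentiating with respect to $x$,
$$2\left(u\frac{\partial u}{\partial x} - v\frac{\partial v}{\partial x}\right) + 2 = 0, \qquad v\frac{\partial u}{\partial x} + u\frac{\partial v}{\partial x} = 0.$$
Hence
$$\frac{\partial u}{\partial x} = -\frac{u}{u^2 + v^2}, \qquad \frac{\partial v}{\partial x} = \frac{v}{u^2 + v^2}.$$
Differentiating the first of these results with respect to $x$ gives
$$\frac{\partial^2 u}{\partial x^2} = \frac{-(u^2 + v^2)\dfrac{\partial u}{\partial x} + 2u\left(u\dfrac{\partial u}{\partial x} + v\dfrac{\partial v}{\partial x}\right)}{(u^2 + v^2)^2} = \frac{u(u^2 + v^2) - 2u(u^2 - v^2)}{(u^2 + v^2)^3} = \frac{u(3v^2 - u^2)}{(u^2 + v^2)^3}.$$
One obtains similarly $\partial^2 v/\partial x^2$, $\partial^2 u/\partial x\,\partial y$, and higher derivatives.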
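Example 2. Let
$$x = u + v, \qquad y = 3u + 2v. \tag{a}$$
Differentiating with respect to $x$,
$$1 = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial x}, \qquad 0 = 3\frac{\partial u}{\partial x} + 2\frac{\partial v}{\partial x},$$
so that
$$\frac{\partial u}{\partial x} = -2.$$
It is easily checked that
$$\frac{\partial v}{\partial x} = 3.$$
Equations (a) can be solved for $u$ and $v$ in terms of $x$ and $y$, and the result is
$$u = -2x + y, \qquad v = 3x - y.$$
Regarding $u$ and $v$ as the independent variables and differentiating these equations with
respect to $u$, one finds
$$1 = -2\frac{\partial x}{\partial u} + \frac{\partial y}{\partial u}, \qquad 0 = 3\frac{\partial x}{\partial u} - \frac{\partial y}{\partial u}.$$
Hence,
$$\frac{\partial x}{\partial u} = 1, \qquad \frac{\partial y}{\partial u} = 3.$$
This agrees with the result obtained by direct differentiation of (a), as of course it should.
Note that $\partial u/\partial x$ and $\partial x/\partial u$ are not reciprocals.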
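The numbers in Example 2 follow from solving a pair of linear equations by Cramer's rule, which can be sketched as below. The helper `solve_2x2` is an illustrative assumption; the coefficients are those of the system obtained by differentiating (a) with respect to $x$.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*p + a12*q = b1, a21*p + a22*q = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the determinant (Jacobian) vanishes")
    p = (b1 * a22 - a12 * b2) / det
    q = (a11 * b2 - b1 * a21) / det
    return p, q


# x = u + v, y = 3u + 2v, differentiated with respect to x (y held fixed):
#   1 = u_x + v_x,   0 = 3 u_x + 2 v_x
du_dx, dv_dx = solve_2x2(1.0, 1.0, 3.0, 2.0, 1.0, 0.0)

# Differentiating x = u + v with respect to u (v held fixed) gives 1 directly.
dx_du = 1.0

if __name__ == "__main__":
    print(du_dx, dv_dx)    # partial derivatives with y held fixed
    print(du_dx * dx_du)   # not equal to 1: the two are not reciprocals
```

The product is not 1 because different variables are held fixed in the two derivatives: $y$ in $\partial u/\partial x$, and $v$ in $\partial x/\partial u$.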
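Example 3. If $w = uv$ and
$$u^2 + v + x = 0, \qquad v^2 - u - y = 0, \tag{b}$$
one can obtain $\partial w/\partial x$ as follows: Differentiation of $w$ with respect to $x$ gives
$$\frac{\partial w}{\partial x} = u\frac{\partial v}{\partial x} + v\frac{\partial u}{\partial x}.$$
The values of $\partial u/\partial x$ and $\partial v/\partial x$ can be calculated from (b) as was done in Example 1.
The reader will check that
$$\frac{\partial w}{\partial x} = -\frac{u + 2v^2}{1 + 4uv}, \qquad \frac{\partial w}{\partial y} = \frac{2u^2 - v}{1 + 4uv}.$$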
PROBLEMS
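1. If $u^2 + v^2 + y^2 - 2x = 0$, $u^3 + v^3 - x^3 + 3y = 0$, find $\partial u/\partial x$, $\partial v/\partial x$, $\partial u/\partial y$, and
$\partial v/\partial y$.
2. Find $\partial w/\partial x$ and $\partial w/\partial y$ if $w = u/v$, and
$$x = u + v, \qquad y = 3u + 2v.$$
3. Show that if $f(x,y,z) = 0$, then $(\partial z/\partial x)(\partial x/\partial z) = 1$ and $(\partial x/\partial y)(\partial y/\partial z)(\partial z/\partial x) = -1$.
Note that in general $\partial z/\partial x$ and $\partial x/\partial z$ are not reciprocals.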
4. If x * x(u,v) t y * y(u,t/) with dx/du * dy/dv, and dx/dv * — dy/du } then
+ = (*z + [ /^V+ C -Vl •
du* dv 2 \dx 2 dyv L Vdu/ \dr/ J
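5. Show that the expressions
$$V_1 = \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2 \qquad \text{and} \qquad V_2 = \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2},$$
upon change of variable by means of $x = r \cos \theta$, $y = r \sin \theta$, become
$$V_1 = \left(\frac{\partial z}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial z}{\partial \theta}\right)^2 \qquad \text{and} \qquad V_2 = \frac{\partial^2 z}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 z}{\partial \theta^2} + \frac{1}{r}\frac{\partial z}{\partial r}.$$
6. Show that
$$\frac{\partial^2 V}{\partial t^2} = c^2\frac{\partial^2 V}{\partial x^2}$$
if $V = f(x + ct) + g(x - ct)$, where $f$ and $g$ are any functions possessing continuous
second derivatives.
7. Show that
$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = e^{-2r}\left(\frac{\partial^2 V}{\partial r^2} + \frac{\partial^2 V}{\partial \theta^2}\right)$$
if $x = e^r \cos \theta$, $y = e^r \sin \theta$.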
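8. Find $\partial u/\partial x$ if
$$u^2 - v^2 - x^3 + 3y = 0, \qquad u + v - y^2 - 2x = 0.$$
9. Prove that
$$\frac{\partial u}{\partial x}\frac{\partial y}{\partial u} + \frac{\partial v}{\partial x}\frac{\partial y}{\partial v} = 0$$
if $F(x,y,u,v) = 0$ and $G(x,y,u,v) = 0$.
10. If $V_1(x,y,z)$ and $V_2(x,y,z)$ satisfy the equation
$$\nabla^2 V \equiv \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0,$$
then
$$U = V_1(x,y,z) + (x^2 + y^2 + z^2)\,V_2(x,y,z)$$
satisfies the equation
$$\nabla^2 \nabla^2 U = 0,$$
where
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$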
11. To indicate explicitly the variables entering in the Jacobian
J(u,v)
dx dy
du du
dx dy
dv dv
one frequently writes J(u,v)
The Jacobian
du
dv
dx
dx
du
dv
dy
dy
of the transformation (7-3) is written as J(x,y) ** J (—] <
\%,y/
Prove that:
where $u = u(x,y)$, $v = v(x,y)$, $x = x(\xi,\eta)$, and $y = y(\xi,\eta)$. Hint: Write out the Jacobians
and multiply.
APPLICATIONS OF DIFFERENTIATION
8. Directional Derivatives. Formula (4-2) has a simple geometrical
meaning when interpreted as the space rate of change of a given function
$u(x,y)$. Thus, let $u(x,y)$ be specified along a smooth curve $C$ with parametric
equations
$$x = x(s), \qquad y = y(s), \tag{8-1}$$
where $s$ is the arc parameter measured along $C$. By virtue of Eqs. (8-1),
$u(x,y)$ can be regarded as a function of $s$ and the rate of change of $u(x,y)$
along $C$ is
$$\frac{du}{ds} = \frac{\partial u}{\partial x}\frac{dx}{ds} + \frac{\partial u}{\partial y}\frac{dy}{ds}. \tag{8-2}$$
At a given point $P_0(x_0,y_0)$ on $C$, Eq. (8-2) yields
$$\frac{du}{ds} = u_x(x_0,y_0) \cos \alpha + u_y(x_0,y_0) \sin \alpha, \tag{8-3}$$
since $dx/ds = \cos \alpha$ and $dy/ds = \sin \alpha$, as is clear from Fig. 5. It follows
from (8-3) that the rate of change of $u(x,y)$ at a given point depends only on
the direction of the curve passing through that point. If the direction of $C$ is that
of the $x$ axis, the angle $\alpha = 0$ and $du/ds = \partial u/\partial x$; if the direction of $C$ is that of
the $y$ axis, $\alpha = \pi/2$ and $du/ds = \partial u/\partial y$. For an arbitrary direction specified by $\alpha$,
Eq. (8-3) defines the directional derivative of $u(x,y)$ in that direction. Thus the
derivatives $u_x$ and $u_y$ are directional derivatives in the directions of the coordinate
axes indicated by the subscripts.
[Fig. 5]
We now ask the question: What is the angle $\alpha$ for which the directional
derivative of $u(x,y)$ at a given point has a maximum value? Since a
necessary condition for a maximum is the vanishing of the derivative of
(8-3) with respect to $\alpha$, we get the equation
$$-u_x(x_0,y_0)\sin\alpha + u_y(x_0,y_0)\cos\alpha = 0,$$
from which we conclude that when $u_x(x_0,y_0) \neq 0$,
$$\tan\alpha = \frac{u_y(x_0,y_0)}{u_x(x_0,y_0)}. \tag{8-4}$$
Accordingly, there are two values of $\alpha$ differing by 180° which satisfy the condition (8-4). The corresponding values of $\cos\alpha$ and $\sin\alpha$ in (8-3), therefore, are
$$\cos\alpha = \frac{\pm u_x}{\sqrt{u_x^2+u_y^2}}, \qquad \sin\alpha = \frac{\pm u_y}{\sqrt{u_x^2+u_y^2}}. \tag{8-5}$$
The substitution in (8-3) of the values from (8-5) with the plus sign yields the desired maximum
$$\frac{du}{ds} = \sqrt{u_x^2+u_y^2}, \tag{8-6}$$
while the other pair of values in (8-5) gives a minimum
$$\frac{du}{ds} = -\sqrt{u_x^2+u_y^2}.$$
The vector pointing in the direction of the greatest rate of increase of $u(x,y)$ at a given point $(x,y)$ and whose length is determined by (8-6) is called the gradient, and its length $\sqrt{u_x^2+u_y^2}$ is called the normal derivative.¹ We denote the normal derivative by $du/dn$ and write
$$\frac{du}{dn} = \sqrt{u_x^2+u_y^2}. \tag{8-7}$$
A similar discussion can be applied to a differentiable function $u(x,y,z)$ defined along a space curve $C$ with parametric equations
$$x = x(s), \qquad y = y(s), \qquad z = z(s). \tag{8-8}$$
We get
$$\frac{du}{ds} = \frac{\partial u}{\partial x}\frac{dx}{ds} + \frac{\partial u}{\partial y}\frac{dy}{ds} + \frac{\partial u}{\partial z}\frac{dz}{ds}, \tag{8-9}$$
where (see Fig. 6)
$$\frac{dx}{ds} = \cos(x,s), \qquad \frac{dy}{ds} = \cos(y,s), \qquad \frac{dz}{ds} = \cos(z,s) \tag{8-10}$$
are the direction cosines of the tangent line $T$ at a given point $P$ of $C$.
Fig. 6
To determine the particular direction yielding a maximum of (8-9) at a given point $P(x_0,y_0,z_0)$, we must maximize the resulting function of the direction cosines in (8-9). This problem, involving the determination of a maximum of functions of several variables, is discussed in Example 3, Sec. 10, where it is shown that the maximum value of $du/ds$ for (8-9) is given by the formula
$$\frac{du}{dn} = \sqrt{u_x^2+u_y^2+u_z^2}, \tag{8-11}$$
analogous to (8-7). The expression (8-11) is called the normal derivative of $u$.
¹ The reason for this terminology is given in Chap. 5, Sec. 3.
Example 1. Find the directional derivative for $u(x,y) = x^2 + y^2$ at (1,1) in the direction making the angle of 30° with the positive $x$ axis.
Formula (8-3) yields
$$\left.\frac{du}{ds}\right|_{(1,1)} = 2x\Big|_{(1,1)}\cos 30^\circ + 2y\Big|_{(1,1)}\sin 30^\circ = \sqrt{3} + 1.$$
The normal derivative at this point, as found with the aid of formula (8-7), is
$$\left.\frac{du}{dn}\right|_{(1,1)} = \left.\sqrt{(2x)^2 + (2y)^2}\,\right|_{(1,1)} = 2\sqrt{2},$$
and the corresponding angle $\alpha$, as follows from (8-4), is 45°.
Example 2. Find the directional derivative of $u(x,y,z) = xyz$ at (1,2,3) in the direction of the line making equal angles with the coordinate axes. Since the angles are equal and the sum of the squares of the direction cosines is 1, we conclude from
$$\cos^2(x,s) + \cos^2(y,s) + \cos^2(z,s) = 1$$
that $\cos(x,s) = \cos(y,s) = \cos(z,s) = 1/\sqrt{3}$. Also, at the point (1,2,3) we have
$$\frac{\partial u}{\partial x} = yz = 6, \qquad \frac{\partial u}{\partial y} = xz = 3, \qquad \frac{\partial u}{\partial z} = xy = 2.$$
The substitution in (8-9) then yields
$$\frac{du}{ds} = \frac{6}{\sqrt{3}} + \frac{3}{\sqrt{3}} + \frac{2}{\sqrt{3}} = \frac{11}{\sqrt{3}}.$$
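The results of Examples 1 and 2 can be checked numerically. The following Python sketch is only an illustration: the helper names `gradient` and `directional_derivative` are not from the text, and the partial derivatives are approximated by central differences (an assumption of this sketch) rather than computed analytically.

```python
import math

def gradient(u, point, h=1e-6):
    """Approximate the partial derivatives of u at point by central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((u(*forward) - u(*backward)) / (2 * h))
    return grad

def directional_derivative(u, point, cosines):
    """Formulas (8-3) and (8-9): partial derivatives weighted by direction cosines."""
    return sum(g * c for g, c in zip(gradient(u, point), cosines))

# Example 1: u = x^2 + y^2 at (1, 1), direction at 30 degrees to the x axis.
alpha = math.radians(30)
ex1 = directional_derivative(lambda x, y: x**2 + y**2, (1.0, 1.0),
                             (math.cos(alpha), math.sin(alpha)))
# Normal derivative (8-7): the length of the gradient.
normal1 = math.hypot(*gradient(lambda x, y: x**2 + y**2, (1.0, 1.0)))

# Example 2: u = xyz at (1, 2, 3), all three direction cosines equal to 1/sqrt(3).
c = 1 / math.sqrt(3)
ex2 = directional_derivative(lambda x, y, z: x * y * z, (1.0, 2.0, 3.0), (c, c, c))

print(ex1, math.sqrt(3) + 1)
print(normal1, 2 * math.sqrt(2))
print(ex2, 11 / math.sqrt(3))
```

Each printed pair should agree to within the accuracy of the difference quotient.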
Example 3. Show that the directional derivative of $u(x,y)$ in two noncollinear directions determines the derivative in all directions.
Let the derivative be given for directions $\alpha_0$ and $\alpha_1$, so that
$$u_x\cos\alpha_0 + u_y\sin\alpha_0 = a,$$
$$u_x\cos\alpha_1 + u_y\sin\alpha_1 = b,$$
where $a$ and $b$ are known. If these are regarded as equations for the unknowns $u_x$ and $u_y$, the coefficient determinant is
$$\begin{vmatrix} \cos\alpha_0 & \sin\alpha_0 \\ \cos\alpha_1 & \sin\alpha_1 \end{vmatrix} = \cos\alpha_0\sin\alpha_1 - \cos\alpha_1\sin\alpha_0.$$
This reduces to $\sin(\alpha_1 - \alpha_0)$, which is zero only if the two directions are collinear. Hence $u_x$ and $u_y$ can be found, and the directional derivative is determined for every direction by (8-3).
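The argument of Example 3 amounts to solving a 2 by 2 linear system. A minimal Python sketch follows, using Cramer's rule; the function name `gradient_from_two_directions` and the sample data (taken from $u = x^2 + y^2$ at (1,1), where $u_x = u_y = 2$) are chosen here for illustration only.

```python
import math

def gradient_from_two_directions(alpha0, a, alpha1, b):
    """Solve u_x cos(alpha_i) + u_y sin(alpha_i) = value_i by Cramer's rule."""
    det = math.cos(alpha0) * math.sin(alpha1) - math.cos(alpha1) * math.sin(alpha0)
    if abs(det) < 1e-12:  # det equals sin(alpha1 - alpha0)
        raise ValueError("the two directions are collinear")
    u_x = (a * math.sin(alpha1) - b * math.sin(alpha0)) / det
    u_y = (b * math.cos(alpha0) - a * math.cos(alpha1)) / det
    return u_x, u_y

# Known derivatives of u = x^2 + y^2 at (1, 1): sqrt(3) + 1 at 30 degrees, 2 at 90 degrees.
u_x, u_y = gradient_from_two_directions(math.radians(30), math.sqrt(3) + 1,
                                        math.radians(90), 2.0)

# Formula (8-3) now gives the derivative in any other direction, e.g. 45 degrees.
alpha = math.radians(45)
derivative_45 = u_x * math.cos(alpha) + u_y * math.sin(alpha)
print(u_x, u_y, derivative_45)
```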
PROBLEMS
1. Find the directional derivative of $f(x,y) = x^2y + \sin xy$ at $(1,\pi/2)$, in the direction of the line making an angle of 45° with the positive $x$ axis.
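2. Find
$$\left[\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right]^{1/2}$$
if $x = r\cos\theta$, $y = r\sin\theta$, and $f$ is a function of the variables $r$ and $\theta$.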
3. Find the directional derivative of $f(x,y) = x^3y + e^{yx}$ in the direction of the curve which, at the point (1,1), makes an angle of 30° with the $x$ axis.
4. Find the normal derivative of $u = x^2 + y^2 + z^2$ at the point (1,2,3) and the directional derivative at that point along the line joining (0,0,0) and (1,2,3).
9. Maxima and Minima of Functions of Several Variables. A function $f(x,y)$ defined in a region $R$ is said to have a relative maximum at a point $(a,b)$ if
$$\Delta f \equiv f(a+h,\,b+k) - f(a,b) < 0 \tag{9-1}$$
for all points $(a+h,\,b+k)$, other than $(a,b)$ itself, in some neighborhood of $(a,b)$. It is said to have a relative minimum at $(a,b)$ if
$$\Delta f \equiv f(a+h,\,b+k) - f(a,b) > 0 \tag{9-2}$$
for all such points $(a+h,\,b+k)$ in some neighborhood of $(a,b)$.
The requirement that the inequalities (9-1) and (9-2) hold for all values $(h,k)$ in the neighborhood of $(a,b)$ implies that we are concerned here only with the interior and not the boundary points of the region. A function may attain a maximum or a minimum value on the boundary of the region, but the behavior of functions on the boundary requires a separate investigation, the nature of which will be clear from the sequel. The greatest and least values assumed by $f(x,y)$ in the closed region are called, respectively, the absolute maximum and the absolute minimum. In the following discussion we dispense with the adjective "relative," and we shall refer to relative maxima and minima simply as maxima and minima.
Let it be assumed that $f(x,y)$ attains a maximum (or minimum) at some interior point $(a,b)$. Then the function $f(x,b)$ of the variable $x$ must attain a maximum (or minimum) at $x = a$. From the study of functions of one variable it follows that the derivative of $f(x,b)$, if it exists, must vanish at $x = a$. The derivative may cease to exist at the critical points when the behavior of the function is like that shown in Fig. 7 in the neighborhood of $x = a_1$, $x = a_2$, and $x = a_3$. Thus, a necessary condition for a maximum (or minimum) of $f(x,b)$ at $x = a$ is that
$$\frac{\partial f}{\partial x} = 0 \tag{9-3}$$
if this derivative exists at $x = a$.
A similar consideration of the function $f(a,y)$ leads to the conclusion that
$$\frac{\partial f}{\partial y} = 0 \quad \text{at } y = b \tag{9-4}$$
whenever this derivative exists.
The coordinates $(a,b)$ thus satisfy the pair of equations
$$\frac{\partial f}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} = 0 \tag{9-5}$$
at any point $(a,b)$ where $f(x,y)$ attains a maximum or minimum and both derivatives exist.
This discussion is capable of extension to functions of any number of variables to yield a theorem.
Theorem. A function $f(x_1,x_2,\ldots,x_n)$ of $n$ independent variables $x_i$ attains a maximum or a minimum only for those values of the variables $x_i$ for which $f_{x_1}, f_{x_2}, \ldots, f_{x_n}$ either vanish simultaneously or cease to exist.
We emphasize that the conditions stated in this theorem are necessary but not sufficient for a maximum or a minimum.¹ Although the matter of sufficiency can usually be determined from the nature of the problem and from physical considerations leading to its formulation, we record here a test that may prove useful to settle doubtful cases.²
If $f(x,y)$ is a function with continuous second partial derivatives, and if $f_x(a,b) = 0$ and $f_y(a,b) = 0$, then $f(a,b)$ is a maximum provided that
$$D \equiv f_{xy}^2(a,b) - f_{xx}(a,b)\,f_{yy}(a,b) < 0$$
and $f_{xx}(a,b) < 0$, $f_{yy}(a,b) < 0$; it is a minimum if $D < 0$ and $f_{xx}(a,b) > 0$, $f_{yy}(a,b) > 0$; it is neither maximum nor minimum (a saddle point) if $D > 0$. This test gives no information if $D = 0$, just as the condition $f''(a) = 0$ gives no information for the function $f(x)$ with $f'(a) = 0$.
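The test just recorded can be sketched in a few lines of Python. This is an illustration only: the function name `classify_critical_point` is not from the text, the second partial derivatives at the critical point are assumed to be supplied by the caller, and the sign convention is that of this section, $D = f_{xy}^2 - f_{xx}f_{yy}$ (many other texts use the opposite sign).

```python
def classify_critical_point(f_xx, f_xy, f_yy):
    """Apply the test of this section with D = f_xy^2 - f_xx * f_yy."""
    D = f_xy**2 - f_xx * f_yy
    if D > 0:
        return "saddle point"
    if D == 0:
        return "no information"
    # D < 0 forces f_xx and f_yy to be nonzero and of the same sign.
    return "maximum" if f_xx < 0 else "minimum"

# Second derivatives at the origin for four simple functions.
print(classify_critical_point(2, 0, 2))    # f = x^2 + y^2
print(classify_critical_point(-2, 0, -2))  # f = -x^2 - y^2
print(classify_critical_point(2, 0, -2))   # f = x^2 - y^2
print(classify_critical_point(0, 0, 0))    # f = x^4 + y^4
```

The last call shows the inconclusive case: $x^4 + y^4$ does have a minimum at the origin, but the test cannot detect it.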
Before proceeding to a further study of maxima and minima, we give
two examples illustrating the developments of this section.
Example 1. A long piece of tin 12 in. wide is made into a trough by bending up the sides to form equal angles with the base (Fig. 8). Find the amount to be bent up and the angle of inclination of the sides that will make the carrying capacity a maximum.
The volume will be a maximum if the area of the trapezoidal cross section is a maximum. The area is
$$A = 12x\sin\theta - 2x^2\sin\theta + x^2\sin\theta\cos\theta,$$
for $12 - 2x$ is the lower base, $12 - 2x + 2x\cos\theta$ is the upper base, and $x\sin\theta$ is the altitude. Then,
$$\frac{\partial A}{\partial\theta} = 12x\cos\theta - 2x^2\cos\theta + x^2\cos^2\theta - x^2\sin^2\theta = x(12\cos\theta - 2x\cos\theta + x\cos^2\theta - x\sin^2\theta)$$
and
$$\frac{\partial A}{\partial x} = 2\sin\theta\,(6 - 2x + x\cos\theta).$$
Now $\partial A/\partial x = 0$ and $\partial A/\partial\theta = 0$ if $\sin\theta = 0$ and $x = 0$, which, from physical considerations, cannot give a maximum.
There remain to be satisfied
$$6 - 2x + x\cos\theta = 0$$
and
$$12\cos\theta - 2x\cos\theta + x\cos^2\theta - x\sin^2\theta = 0.$$
Solving the first equation for $x$ and substituting in the second yield, upon simplification, $\cos\theta = \tfrac{1}{2}$, or $\theta = 60°$, and $x = 4$.
Since physical considerations show that a maximum exists, $x = 4$ and $\theta = 60°$ must give the maximum.
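The result of Example 1 can be confirmed by a brute-force search over the area function. The sketch below is an illustration under stated assumptions: the search is restricted to $0 < x < 6$ and $0 < \theta < 90°$, and the grid steps (0.01 in. and 0.1°) are arbitrary choices made here.

```python
import math

def area(x, theta):
    """Cross-sectional area A = 12x sin(t) - 2x^2 sin(t) + x^2 sin(t) cos(t)."""
    return (12 * x * math.sin(theta) - 2 * x**2 * math.sin(theta)
            + x**2 * math.sin(theta) * math.cos(theta))

# Evaluate A on a grid and keep the largest value with its coordinates.
best_area, best_x, best_theta_deg = max(
    ((area(0.01 * i, math.radians(0.1 * j)), 0.01 * i, 0.1 * j)
     for i in range(1, 600) for j in range(1, 900)),
    key=lambda t: t[0],
)
print(best_x, best_theta_deg, best_area)
```

The largest grid value should occur near $x = 4$ and $\theta = 60°$, where the area equals $12\sqrt{3}$.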
¹ Recall, for example, the situation when $f(x)$ has a point of inflection with a horizontal tangent.
² A proof and further discussion are contained in I. S. Sokolnikoff, "Advanced Calculus," sec. 89, McGraw-Hill Book Company, Inc., New York, 1939.
Example 2. Find the maxima and minima of the surface
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 2cz.$$
Now,
$$\frac{\partial z}{\partial x} = \frac{x}{ca^2}, \qquad \frac{\partial z}{\partial y} = -\frac{y}{cb^2},$$
which vanish when $x = y = 0$. But
$$\frac{\partial^2 z}{\partial x^2} = \frac{1}{ca^2}, \qquad \frac{\partial^2 z}{\partial y^2} = -\frac{1}{cb^2}, \qquad \frac{\partial^2 z}{\partial x\,\partial y} = 0.$$
Hence, $D = 1/a^2b^2c^2 > 0$, and consequently, there is no maximum or minimum at $x = y = 0$.
The surface under consideration is a saddle-shaped surface called a hyperbolic paraboloid. The points for which the first partial derivatives vanish and $D > 0$ are called minimax. The reason for this odd name appears from a consideration of the shape of the hyperbolic paraboloid near the origin of the coordinate system. The reader will benefit from sketching it in the vicinity of (0,0,0).
PROBLEMS
1. Divide $a$ into three parts such that their product is a maximum. Test by using the second-derivative criterion.
2. Find the volume of the largest rectangular parallelepiped that can be inscribed in the ellipsoid
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$
3. Find the dimensions of the largest rectangular parallelepiped that has three faces in the coordinate planes and one vertex in the plane
$$\frac{x}{a} + \frac{y}{b} + \frac{z}{c} = 1.$$
4. A pentagonal frame is composed of a rectangle surmounted by an isosceles triangle. What are the dimensions for maximum area of the pentagon if the perimeter is given as $P$?
5. A floating anchorage is designed with a body in the form of a right-circular cylinder with equal ends that are right-circular cones. If the volume is given, find the dimensions giving the minimum surface area.
6. Given $n$ points $P_i$ whose coordinates are $(x_i,y_i,z_i)$ $(i = 1,2,\ldots,n)$. Show that the coordinates of the point $P(x,y,z)$, such that the sum of the squares of the distances from $P$ to the $P_i$ is a minimum, are given by
$$x = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad z = \frac{1}{n}\sum_{i=1}^{n} z_i.$$
10. Constrained Maxima and Minima. The discussion in the preceding section was confined to the calculation of the maximum and minimum values of functions of several independent variables. In a large number of investigations, it is required that the maximum and minimum values of a differentiable function $f(x_1,x_2,\ldots,x_n)$ be found when the variables $x_i$ are connected by some functional relationships, so that the $x_i$ are no longer independent. Such problems are called problems in constrained maxima to distinguish them from the problems in free maxima discussed in Sec. 9.
To avoid circumlocution, we shall speak of the maximum or minimum values as the extreme values. Thus, let us consider the problem of finding the extreme values of the function
$$u = f(x,y,z), \tag{10-1}$$
in which the variables $x$, $y$, $z$ are constrained by the relation
$$\varphi(x,y,z) = 0. \tag{10-2}$$
This problem can be solved by the procedure of Sec. 9 as follows: Suppose that the constraining relation (10-2) is solved for one of the variables, say $z$, to yield a differentiable function
$$z = \psi(x,y). \tag{10-3}$$
If one substitutes $z$ from (10-3) in (10-1), there results the function
$$u = f[x,y,\psi(x,y)] \equiv F(x,y) \tag{10-4}$$
of two independent variables $x$, $y$ to which the considerations of Sec. 9 apply.
However, either the solution (10-3) may be difficult to obtain or the function $F(x,y)$ in (10-4) may be so unwieldy that the simultaneous equations $F_x(x,y) = 0$, $F_y(x,y) = 0$ are unpleasant to deal with. In this event an ingenious method devised by the great French analyst Lagrange often leads to a manageable and symmetric system of equations for the determination of extreme values. The central idea of the method hinges on the following observation. In Sec. 9, we saw that a necessary condition for a relative extremum of the differentiable function $f(x_1,x_2,\ldots,x_n)$ of $n$ independent variables is the simultaneous vanishing of all partial derivatives $f_{x_i}$. Inasmuch as the total differential of $f$ is
$$df = f_{x_1}\,dx_1 + f_{x_2}\,dx_2 + \cdots + f_{x_n}\,dx_n,$$
it is clear that $df = 0$ whenever each $f_{x_i} = 0$. Conversely, if $df = 0$, the partial derivatives $f_{x_i}$ vanish, since the $dx_i$ are independent. But it is also true that the vanishing of the total differential is a necessary condition for an extremum of $f(x_1,x_2,\ldots,x_n)$ even when the variables $x_i$ are dependent because of the invariant character of $df$ stated in the theorem of Sec. 5. We can thus state a theorem:
Theorem. A necessary condition for an extremum of a differentiable function $f(x_1,x_2,\ldots,x_n)$ is the vanishing of its total differential at the maximum and minimum points of the function.
We proceed now to a discussion of the method of Lagrange multipliers for determining the extreme values of the function in (10-1) subject to the equation of constraint (10-2).
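By the theorem just stated, the differential of (10-1) vanishes at the critical points so that
$$\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz = 0. \tag{10-5}$$
Also, since $\varphi(x,y,z) = 0$, its total differential vanishes and we can write
$$\frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy + \frac{\partial\varphi}{\partial z}\,dz = 0. \tag{10-6}$$
Let Eq. (10-6) be multiplied by some parameter $\lambda$ and then added to (10-5). The result is
$$\left(\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y}\right)dy + \left(\frac{\partial f}{\partial z} + \lambda\frac{\partial\varphi}{\partial z}\right)dz = 0. \tag{10-7}$$
If we regard $x$ and $y$ as independent variables, and suppose that $\partial\varphi/\partial z \neq 0$ at the point where the extremum is attained, then we can find a $\lambda$ such that at this point
$$\frac{\partial f}{\partial z} + \lambda\frac{\partial\varphi}{\partial z} = 0. \tag{10-8}$$
With this choice of $\lambda$, Eq. (10-7) reduces to
$$\left(\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y}\right)dy = 0.$$
But since $dx$ and $dy$ are independent increments, we conclude from this equation that
$$\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y} = 0. \tag{10-9}$$
The system of three equations (10-8) and (10-9) contains four unknowns $x$, $y$, $z$, $\lambda$, and we must adjoin to it the fourth equation (10-2) to obtain the complete system for the determination of the unknowns.
If $\partial\varphi/\partial z = 0$ at the point where the extremum is attained, but $\partial\varphi/\partial y \neq 0$, the roles of $z$ and $y$ in the foregoing discussion are interchanged. Clearly, the method will fail to yield the desired value of $\lambda$ when $\varphi_x$, $\varphi_y$, and $\varphi_z$ vanish simultaneously at the point where $f(x,y,z)$ has an extremum.
Before proceeding to extend the Lagrange method to the study of extreme values of functions with several constraining conditions, we consider four instructive examples.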
Example 1. Find the maximum and the minimum distances from the origin to the curve
$$5x^2 + 6xy + 5y^2 - 8 = 0.$$
The problem here is to determine the extreme values of
$$f(x,y) = x^2 + y^2$$
subject to the condition
$$\varphi(x,y) \equiv 5x^2 + 6xy + 5y^2 - 8 = 0.$$
Equations (10-9) and (10-2) in this case read
$$2x + \lambda(10x + 6y) = 0,$$
$$2y + \lambda(6x + 10y) = 0,$$
$$5x^2 + 6xy + 5y^2 - 8 = 0.$$
Multiplying the first of these equations by $y$ and the second by $x$ and then subtracting give
$$6\lambda(y^2 - x^2) = 0,$$
so that $y = \pm x$. Substituting these values of $y$ in the third equation gives two equations for the determination of $x$, namely,
$$2x^2 = 1 \qquad \text{and} \qquad x^2 = 2.$$
The first of these gives $f \equiv x^2 + y^2 = 1$, and the second gives $f \equiv x^2 + y^2 = 4$. Obviously, the first value is a minimum, whereas the second is a maximum. The curve is an ellipse of semiaxes 2 and 1 whose major axis makes an angle of 45° with the $x$ axis.
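The values $f = 1$ and $f = 4$ found in Example 1 can be checked in two ways. The Python sketch below is an illustration only: the polar parametrization of the curve, the helper names, and the multiplier values $\lambda = -\tfrac{1}{8}$ and $\lambda = -\tfrac{1}{2}$ (obtained here by substituting $y = \pm x$ in the first Lagrange equation) are not given in the text.

```python
import math

def squared_distance(t):
    """x^2 + y^2 for the point of 5x^2 + 6xy + 5y^2 = 8 on the ray at polar angle t."""
    c, s = math.cos(t), math.sin(t)
    return 8.0 / (5 * c * c + 6 * c * s + 5 * s * s)

# Sample the whole curve in steps of 0.1 degree.
samples = [squared_distance(2 * math.pi * k / 3600) for k in range(3600)]
f_min, f_max = min(samples), max(samples)

def lagrange_residuals(x, y, lam):
    """Left-hand sides of the three equations of Example 1."""
    return (2 * x + lam * (10 * x + 6 * y),
            2 * y + lam * (6 * x + 10 * y),
            5 * x * x + 6 * x * y + 5 * y * y - 8)

r = 1 / math.sqrt(2)
res_min = lagrange_residuals(r, r, -1 / 8)                         # y = x, 2x^2 = 1
res_max = lagrange_residuals(math.sqrt(2), -math.sqrt(2), -1 / 2)  # y = -x, x^2 = 2
print(f_min, f_max)
print(res_min, res_max)
```

All residuals should vanish up to rounding, and the sampled extremes should agree with the values 1 and 4.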
Example 2. Find the dimensions of the rectangular box, without a top, of maximum capacity whose surface is 108 in.²
The function to be maximized is
$$f(x,y,z) \equiv xyz,$$
subject to the condition
$$xy + 2xz + 2yz = 108. \tag{10-10}$$
Equations (10-8) and (10-9) yield
$$yz + \lambda(y + 2z) = 0,$$
$$xz + \lambda(x + 2z) = 0, \tag{10-11}$$
$$xy + \lambda(2x + 2y) = 0.$$
In order to solve these equations, multiply the first by $x$, the second by $y$, and the last by $z$, and add. There results
$$\lambda(2xy + 4xz + 4yz) + 3xyz = 0,$$
or
$$\lambda(xy + 2xz + 2yz) + \tfrac{3}{2}xyz = 0.$$
Substituting from (10-10) gives
$$108\lambda + \tfrac{3}{2}xyz = 0,$$
or
$$\lambda = -\frac{xyz}{72}.$$
Substituting this value of $\lambda$ in (10-11) and dividing out common factors give
$$1 - \frac{x}{72}(y + 2z) = 0,$$
$$1 - \frac{y}{72}(x + 2z) = 0,$$
$$1 - \frac{z}{72}(2x + 2y) = 0.$$
From the first two of these equations, it is evident that $x = y$. The substitution of $x = y$ in the third equation gives $z = 18/y$. Substituting for $y$ and $z$ in the first equation yields $x = 6$. Thus, $x = 6$, $y = 6$, and $z = 3$ give the desired dimensions.
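The dimensions found in Example 2 can be verified by eliminating $z$ with the constraint (10-10) and searching over $x$ and $y$, which is the direct procedure of Sec. 9. The sketch below is illustrative; the helper name `box_volume`, the grid step 0.1, and the search range $0 < x, y \le 10$ are assumptions of this sketch.

```python
def box_volume(x, y):
    """Return (volume, z) for an open box with base x by y and surface 108."""
    z = (108 - x * y) / (2 * (x + y))  # solve xy + 2xz + 2yz = 108 for z
    return x * y * z, z

candidates = ((box_volume(0.1 * i, 0.1 * j)[0], 0.1 * i, 0.1 * j)
              for i in range(1, 101) for j in range(1, 101))
best_volume, best_x, best_y = max(candidates)
best_z = box_volume(best_x, best_y)[1]

# Check the Lagrange system (10-11) at the maximizer, with lambda = -xyz/72.
lam = -best_x * best_y * best_z / 72
residuals = (best_y * best_z + lam * (best_y + 2 * best_z),
             best_x * best_z + lam * (best_x + 2 * best_z),
             best_x * best_y + lam * (2 * best_x + 2 * best_y))
print(best_x, best_y, best_z, best_volume)
print(residuals)
```

The search should return a box close to 6 by 6 by 3 with volume 108, and the three residuals should vanish up to rounding.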
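Example 3. Show that the maximum value of the directional derivative of $u(x,y,z)$ at any point is given by
$$\frac{du}{dn} = \sqrt{u_x^2 + u_y^2 + u_z^2}.$$
We write the directional derivative [see Eq. (8-9)] in the form
$$f(\alpha,\beta,\gamma) \equiv \frac{du}{ds} = u_x\cos\alpha + u_y\cos\beta + u_z\cos\gamma, \tag{10-12}$$
where $\cos\alpha = \cos(x,s)$, $\cos\beta = \cos(y,s)$, $\cos\gamma = \cos(z,s)$, and maximize $f(\alpha,\beta,\gamma)$ subject to the constraining condition
$$\varphi(\alpha,\beta,\gamma) \equiv \cos^2\alpha + \cos^2\beta + \cos^2\gamma - 1 = 0. \tag{10-13}$$
The system of Eqs. (10-8) and (10-9) then yields
$$-u_x\sin\alpha - 2\lambda\cos\alpha\sin\alpha = 0,$$
$$-u_y\sin\beta - 2\lambda\cos\beta\sin\beta = 0, \tag{10-14}$$
$$-u_z\sin\gamma - 2\lambda\cos\gamma\sin\gamma = 0.$$
The case when either $\sin\alpha$, $\sin\beta$, or $\sin\gamma$ vanishes is trivial because of the constraining condition (10-13). Thus, the system (10-14) reduces to
$$u_x = -2\lambda\cos\alpha, \qquad u_y = -2\lambda\cos\beta, \qquad u_z = -2\lambda\cos\gamma, \tag{10-15}$$
and we conclude from (10-13) that
$$u_x^2 + u_y^2 + u_z^2 = 4\lambda^2.$$
Thus, $\lambda = \pm\tfrac{1}{2}\sqrt{u_x^2 + u_y^2 + u_z^2}$, and the substitution in (10-15) of the value with the minus sign, which corresponds to the maximum, gives
$$\cos\alpha = \frac{u_x}{\sqrt{u_x^2 + u_y^2 + u_z^2}}, \qquad \cos\beta = \frac{u_y}{\sqrt{u_x^2 + u_y^2 + u_z^2}}, \qquad \cos\gamma = \frac{u_z}{\sqrt{u_x^2 + u_y^2 + u_z^2}}.$$
On inserting these values in (10-12) we get the desired result.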
Example 4. Find the shortest distance from the origin to the curve
$$y^2 = (x - 1)^3 \tag{10-16}$$
in Fig. 9. We apply the procedure employed in Example 1 to minimize
$$f(x,y) = x^2 + y^2$$
subject to the constraining condition
$$\varphi(x,y) \equiv y^2 - (x - 1)^3 = 0. \tag{10-17}$$
Equations (10-9) now yield
$$2x - 3\lambda(x - 1)^2 = 0, \qquad 2y + 2\lambda y = 0, \tag{10-18}$$
which must be solved together with (10-17). The system (10-17) and (10-18) has no solutions for $x$, $y$, and $\lambda$. This becomes obvious on noting that the minimum is attained at $x = 1$, $y = 0$, and if we insert these values in (10-18), the first of the resulting equations yields a nonsensical result $2 = 0$ while the second is true for all values of $\lambda$. The reason that the Lagrange method this time has failed to give the solution is simple. The method depends on the assumption that not both $\varphi_x$ and $\varphi_y$ vanish at the point where the extremum is attained. In our case $\varphi_x(1,0) = 0$ and $\varphi_y(1,0) = 0$. The moral of this example is that the Lagrange method yields the solution of the problem only when the system of Eqs. (10-8) and (10-9) can be solved for $\lambda$.
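The failure described in Example 4 is easy to see numerically. The following Python sketch is illustrative only: it samples the upper branch $y = (x-1)^{3/2}$ of the curve (the sampling range and step are arbitrary choices made here), locates the nearest point to the origin, and evaluates the partial derivatives of $\varphi$ there.

```python
import math

def phi_gradient(x, y):
    """Partial derivatives of phi(x, y) = y^2 - (x - 1)^3."""
    return -3 * (x - 1) ** 2, 2 * y

# Points of the curve have x >= 1; by symmetry the upper branch suffices.
points = [(1 + 0.001 * k, (0.001 * k) ** 1.5) for k in range(2001)]
closest = min(points, key=lambda p: math.hypot(*p))
shortest = math.hypot(*closest)
grad_at_closest = phi_gradient(*closest)

# First equation of (10-18) at (1, 0): 2x - 3*lam*(x - 1)^2 equals 2 for every lam.
residuals = [2 * 1 - 3 * lam * (1 - 1) ** 2 for lam in (-10.0, 0.0, 10.0)]
print(closest, shortest, grad_at_closest, residuals)
```

The nearest point is (1, 0) at distance 1, both partial derivatives of $\varphi$ vanish there, and no choice of $\lambda$ removes the residual 2 in the first Lagrange equation.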
PROBLEMS
1. Work Probs. 1, 2, and 3, Sec. 9, by using Lagrangian multipliers.
2. Prove that the point of intersection of the medians of a triangle possesses the property that the sum of the squares of its distances from the vertices is a minimum.
3. Find the maximum and the minimum of the sum of the angles made by a line from the origin with (a) the coordinate axes of a cartesian system, (b) the coordinate planes.
4. Find the maximum distance from the origin to the folium of Descartes $x^3 + y^3 - 3axy = 0$.
5. Find the shortest distance from the origin to the plane
$$ax + by + cz = d.$$
11. Lagrange Multipliers. We now extend the considerations of Sec. 10 to cases where the extremum of the function $f(x_1,x_2,\ldots,x_n)$ is sought under several conditions of constraint.
We consider first the function
$$w = f(x,y,u,v), \tag{11-1}$$
in which the variables are constrained by two relations
$$\varphi_1(x,y,u,v) = 0, \qquad \varphi_2(x,y,u,v) = 0. \tag{11-2}$$
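If $w$ takes on the extreme values for certain values of $(x,y,u,v)$, then for such values
$$\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial u}\,du + \frac{\partial f}{\partial v}\,dv = 0, \tag{11-3}$$
by the theorem in the preceding section. Also, (11-2) yields two equations:
$$\frac{\partial\varphi_1}{\partial x}\,dx + \frac{\partial\varphi_1}{\partial y}\,dy + \frac{\partial\varphi_1}{\partial u}\,du + \frac{\partial\varphi_1}{\partial v}\,dv = 0,$$
$$\frac{\partial\varphi_2}{\partial x}\,dx + \frac{\partial\varphi_2}{\partial y}\,dy + \frac{\partial\varphi_2}{\partial u}\,du + \frac{\partial\varphi_2}{\partial v}\,dv = 0. \tag{11-4}$$
We multiply the first of these by $\lambda_1$ and the second by $\lambda_2$, add the results to (11-3), and obtain
$$\left(\frac{\partial f}{\partial x} + \lambda_1\frac{\partial\varphi_1}{\partial x} + \lambda_2\frac{\partial\varphi_2}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda_1\frac{\partial\varphi_1}{\partial y} + \lambda_2\frac{\partial\varphi_2}{\partial y}\right)dy$$
$$+ \left(\frac{\partial f}{\partial u} + \lambda_1\frac{\partial\varphi_1}{\partial u} + \lambda_2\frac{\partial\varphi_2}{\partial u}\right)du + \left(\frac{\partial f}{\partial v} + \lambda_1\frac{\partial\varphi_1}{\partial v} + \lambda_2\frac{\partial\varphi_2}{\partial v}\right)dv = 0. \tag{11-5}$$
Now, if
$$J(u,v) \equiv \begin{vmatrix} \dfrac{\partial\varphi_1}{\partial u} & \dfrac{\partial\varphi_1}{\partial v} \\[2mm] \dfrac{\partial\varphi_2}{\partial u} & \dfrac{\partial\varphi_2}{\partial v} \end{vmatrix} \neq 0,$$
the values of $\lambda_1$ and $\lambda_2$ can be found such that
$$\frac{\partial f}{\partial u} + \lambda_1\frac{\partial\varphi_1}{\partial u} + \lambda_2\frac{\partial\varphi_2}{\partial u} = 0,$$
$$\frac{\partial f}{\partial v} + \lambda_1\frac{\partial\varphi_1}{\partial v} + \lambda_2\frac{\partial\varphi_2}{\partial v} = 0, \tag{11-6}$$
and accordingly (11-5) reduces to the sum of two terms involving arbitrary differentials $dx$ and $dy$. The fact that they are arbitrary enables us to conclude that
$$\frac{\partial f}{\partial x} + \lambda_1\frac{\partial\varphi_1}{\partial x} + \lambda_2\frac{\partial\varphi_2}{\partial x} = 0,$$
$$\frac{\partial f}{\partial y} + \lambda_1\frac{\partial\varphi_1}{\partial y} + \lambda_2\frac{\partial\varphi_2}{\partial y} = 0. \tag{11-7}$$
The system of six equations (11-6), (11-7), and (11-2) serves to determine the parameters $\lambda_1$, $\lambda_2$ and the point $(x,y,u,v)$ at which the extreme is attained.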
The foregoing procedure may be extended to cover the case of more than two constraining conditions and we obtain the following rule:
Rule. In order to determine the extreme values of a function
$$f(x_1,x_2,\ldots,x_n)$$
whose variables are subjected to $m$ constraining relations
$$\varphi_i(x_1,x_2,\ldots,x_n) = 0, \qquad i = 1, 2, \ldots, m, \tag{11-8}$$
form the function
$$F = f + \sum_{i=1}^{m}\lambda_i\varphi_i \tag{11-9}$$
and determine the parameters $\lambda_i$ and the values of $x_1, x_2, \ldots, x_n$ from the $n$ equations
$$\frac{\partial F}{\partial x_j} = 0, \qquad j = 1, 2, \ldots, n, \tag{11-10}$$
and the $m$ equations (11-8).
It should be carefully noted that the applicability of this rule to specific problems depends on the possibility of determining the multipliers $\lambda_i$. The existence of the $\lambda_i$ was established above only under the hypothesis that $J \neq 0$.
Example: As an illustration, consider the problem of determining the maximum and the minimum distances from the origin to the curve of intersection of the ellipsoid
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$
with the plane
$$Ax + By + Cz = 0.$$
The square of the distance from the origin to any point $(x,y,z)$ is
$$f = x^2 + y^2 + z^2,$$
and it is necessary to find the extreme values of this function when the point $(x,y,z)$ is common to the ellipsoid and the plane. The constraining relations are, therefore,
$$\varphi_1 \equiv \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} - 1 = 0 \tag{a}$$
and
$$\varphi_2 \equiv Ax + By + Cz = 0. \tag{b}$$
The function $F = f + \lambda_1\varphi_1 + \lambda_2\varphi_2$ is, in this case,
$$F = x^2 + y^2 + z^2 + \lambda_1\left(\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} - 1\right) + 2\lambda_2(Ax + By + Cz),$$
where the factor of 2 is introduced in the last term for convenience. Equations (11-10) then become
$$x + \lambda_1\frac{x}{a^2} + \lambda_2 A = 0,$$
$$y + \lambda_1\frac{y}{b^2} + \lambda_2 B = 0, \tag{c}$$
$$z + \lambda_1\frac{z}{c^2} + \lambda_2 C = 0.$$
These equations, together with (a) and (b), give five equations for the determination of the five unknowns $x$, $y$, $z$, $\lambda_1$, and $\lambda_2$. If the first, second, and third of equations (c) are multiplied by $x$, $y$, and $z$, respectively, and then added, there results
$$(x^2 + y^2 + z^2) + \lambda_1\left(\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2}\right) + \lambda_2(Ax + By + Cz) = 0.$$
Making use of (a) and (b), it is evident that
$$\lambda_1 = -(x^2 + y^2 + z^2) = -f.$$
Setting this value of $\lambda_1$ in (c) and solving for $x$, $y$, and $z$, we get
$$x\left(1 - \frac{f}{a^2}\right) + \lambda_2 A = 0, \qquad y\left(1 - \frac{f}{b^2}\right) + \lambda_2 B = 0, \qquad z\left(1 - \frac{f}{c^2}\right) + \lambda_2 C = 0,$$
so that
$$x = -\frac{\lambda_2 A a^2}{a^2 - f}, \qquad y = -\frac{\lambda_2 B b^2}{b^2 - f}, \qquad z = -\frac{\lambda_2 C c^2}{c^2 - f}.$$
When these values of $x$, $y$, and $z$ are substituted in (b), one obtains
$$\frac{A^2a^2}{a^2 - f} + \frac{B^2b^2}{b^2 - f} + \frac{C^2c^2}{c^2 - f} = 0,$$
from which $f$ can be readily determined by solving the quadratic equation in $f$.
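The quadratic in $f$ obtained in this example can be compared with a direct sampling of the curve of intersection. The Python sketch below is illustrative: clearing denominators in the final equation is done here (it is not written out in the text), the sample values $a = 3$, $b = 2$, $c = 1$, $A = B = C = 1$ are arbitrary, and all function names are hypothetical.

```python
import math

def cross(p, q):
    """Vector product of two 3-vectors."""
    return (p[1] * q[2] - p[2] * q[1], p[2] * q[0] - p[0] * q[2], p[0] * q[1] - p[1] * q[0])

def extremes_from_quadratic(a, b, c, A, B, C):
    """Roots of A^2a^2(b^2-f)(c^2-f) + B^2b^2(a^2-f)(c^2-f) + C^2c^2(a^2-f)(b^2-f) = 0."""
    a2, b2, c2 = a * a, b * b, c * c
    p = A * A * a2 + B * B * b2 + C * C * c2                       # coefficient of f^2
    q = -(A * A * a2 * (b2 + c2) + B * B * b2 * (a2 + c2) + C * C * c2 * (a2 + b2))
    r = a2 * b2 * c2 * (A * A + B * B + C * C)                     # constant term
    disc = math.sqrt(q * q - 4 * p * r)
    return (-q - disc) / (2 * p), (-q + disc) / (2 * p)

def extremes_by_sampling(a, b, c, A, B, C, n=20000):
    """Sample directions in the plane and scale each onto the ellipsoid."""
    normal = (A, B, C)
    axis = min([(1, 0, 0), (0, 1, 0), (0, 0, 1)],
               key=lambda e: abs(sum(u * v for u, v in zip(e, normal))))
    e1 = cross(normal, axis)   # e1 and e2 span the plane Ax + By + Cz = 0
    e2 = cross(normal, e1)
    values = []
    for k in range(n):
        t = math.pi * k / n    # a half turn suffices: d and -d give the same f
        d = [math.cos(t) * u + math.sin(t) * v for u, v in zip(e1, e2)]
        scale = d[0] ** 2 / a**2 + d[1] ** 2 / b**2 + d[2] ** 2 / c**2
        values.append(sum(x * x for x in d) / scale)
    return min(values), max(values)

lo_q, hi_q = extremes_from_quadratic(3, 2, 1, 1, 1, 1)
lo_s, hi_s = extremes_by_sampling(3, 2, 1, 1, 1, 1)
print(lo_q, hi_q)
print(lo_s, hi_s)
```

The two roots are the squares of the minimum and maximum distances; the sampled values should agree with them closely.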
PROBLEMS
1. Find the point $P$, in the plane of the triangle $ABC$, for which the sum of the distances from the vertices is a minimum.¹
2. Find the triangle of minimum perimeter which can be inscribed in a given triangle.
¹ See E. Goursat's "Mathematical Analysis," English ed., vol. 1, p. 130, for a detailed discussion of this interesting problem.
12. Taylor's Formula for Functions of Several Variables. Let $f(x,y)$ be a function of two variables $x$ and $y$ that is continuous in the neighborhood of the point $(a,b)$ and that has continuous partial derivatives, up to and including those of order $n$, in the vicinity of this point.
If a new independent variable $t$ is introduced with the aid of the relations
$$x = a + \alpha t, \qquad y = b + \beta t, \tag{12-1}$$
where $\alpha$ and $\beta$ are constants, a function of the single variable $t$ will result, namely,
$$F(t) \equiv f(x,y) = f(a + \alpha t,\, b + \beta t). \tag{12-2}$$
Expanding $F(t)$ with the aid of the Maclaurin formula gives
$$F(t) = F(0) + F'(0)\,t + \frac{F''(0)}{2!}\,t^2 + \cdots + \frac{F^{(n)}(\theta t)}{n!}\,t^n, \tag{12-3}$$
where $0 < \theta < 1$.
It follows from (12-1) and (12-2) that¹
$$F'(t) = f_x(x,y)\frac{dx}{dt} + f_y(x,y)\frac{dy}{dt} = f_x(x,y)\,\alpha + f_y(x,y)\,\beta.$$
Calculating $F''(t)$ and $F'''(t)$ from this expression gives
$$F''(t) = [f_{xx}(x,y)\,\alpha + f_{yx}(x,y)\,\beta]\frac{dx}{dt} + [f_{xy}(x,y)\,\alpha + f_{yy}(x,y)\,\beta]\frac{dy}{dt}$$
$$= f_{xx}(x,y)\,\alpha^2 + 2f_{xy}(x,y)\,\alpha\beta + f_{yy}(x,y)\,\beta^2,$$
and
$$F'''(t) = [f_{xxx}(x,y)\,\alpha^2 + 2f_{xyx}(x,y)\,\alpha\beta + f_{yyx}(x,y)\,\beta^2]\frac{dx}{dt} + [f_{xxy}(x,y)\,\alpha^2 + 2f_{xyy}(x,y)\,\alpha\beta + f_{yyy}(x,y)\,\beta^2]\frac{dy}{dt}$$
$$= f_{xxx}(x,y)\,\alpha^3 + 3f_{xxy}(x,y)\,\alpha^2\beta + 3f_{xyy}(x,y)\,\alpha\beta^2 + f_{yyy}(x,y)\,\beta^3.$$
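Higher-order derivatives of $F(t)$ can be obtained by continuing this process, but the form is evident from those already obtained. Symbolically expressed,
$$F'(t) = \left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y}\right) f(x,y) \equiv \alpha\frac{\partial f}{\partial x} + \beta\frac{\partial f}{\partial y},$$
$$F''(t) = \left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y}\right)^2 f(x,y) \equiv \alpha^2\frac{\partial^2 f}{\partial x^2} + 2\alpha\beta\frac{\partial^2 f}{\partial x\,\partial y} + \beta^2\frac{\partial^2 f}{\partial y^2},$$
$$F'''(t) = \left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y}\right)^3 f(x,y) \equiv \alpha^3\frac{\partial^3 f}{\partial x^3} + 3\alpha^2\beta\frac{\partial^3 f}{\partial x^2\,\partial y} + 3\alpha\beta^2\frac{\partial^3 f}{\partial x\,\partial y^2} + \beta^3\frac{\partial^3 f}{\partial y^3}.$$
Then
$$F^{(n)}(t) = \left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y}\right)^n f(x,y) \equiv \alpha^n\frac{\partial^n f}{\partial x^n} + \binom{n}{1}\alpha^{n-1}\beta\frac{\partial^n f}{\partial x^{n-1}\,\partial y} + \cdots + \binom{n}{r}\alpha^{n-r}\beta^r\frac{\partial^n f}{\partial x^{n-r}\,\partial y^r} + \cdots + \beta^n\frac{\partial^n f}{\partial y^n},$$
where
$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}.$$
¹ See Sec. 4.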
Since $t = 0$ gives $x = a$ and $y = b$,
$$F(0) = f(a,b), \qquad F'(0) = \alpha f_x(a,b) + \beta f_y(a,b), \qquad \ldots.$$
Substituting these expressions in (12-3) gives
$$F(t) \equiv f(x,y) = f(a,b) + [\alpha f_x(a,b) + \beta f_y(a,b)]\,t + [\alpha^2 f_{xx}(a,b) + 2\alpha\beta f_{xy}(a,b) + \beta^2 f_{yy}(a,b)]\frac{t^2}{2!} + \cdots + R_n,$$
where
$$R_n = \frac{t^n}{n!}\left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y}\right)^n f(a + \theta\alpha t,\, b + \theta\beta t).$$
Since $\alpha t = x - a$ and $\beta t = y - b$, the expansion becomes
$$f(x,y) = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b)$$
$$+ \frac{1}{2!}\left[f_{xx}(a,b)(x - a)^2 + 2f_{xy}(a,b)(x - a)(y - b) + f_{yy}(a,b)(y - b)^2\right] + \cdots + R_n. \tag{12-4}$$
This is Taylor's expansion for a function $f(x,y)$ about the point $(a,b)$. Another useful form of (12-4) is obtained by replacing $x - a$ by $h$ and $y - b$ by $k$, so that $x = a + h$ and $y = b + k$. Then,
$$f(a + h,\, b + k) = f(a,b) + f_x(a,b)\,h + f_y(a,b)\,k + \frac{1}{2!}\left[f_{xx}(a,b)\,h^2 + 2f_{xy}(a,b)\,hk + f_{yy}(a,b)\,k^2\right] + \cdots + R_n, \tag{12-5}$$
where
$$R_n = \frac{1}{n!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^n f(a + \theta h,\, b + \theta k), \qquad 0 < \theta < 1.$$
This formula is frequently written symbolically as
$$f(a + h,\, b + k) = f(a,b) + \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right) f(a,b) + \frac{1}{2!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^2 f(a,b) + \cdots + R_n.$$
In particular, if the point $(a,b)$ is (0,0), the formula (12-4) reads
$$f(x,y) = f(0,0) + f_x(0,0)\,x + f_y(0,0)\,y + \frac{1}{2!}\left[f_{xx}(0,0)\,x^2 + 2f_{xy}(0,0)\,xy + f_{yy}(0,0)\,y^2\right] + \cdots + R_n, \tag{12-6}$$
where
$$R_n = \frac{1}{n!}\left(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}\right)^n f(\theta x,\, \theta y), \qquad 0 < \theta < 1.$$
This development is known as the Maclaurin formula for functions of two variables. It is seen from (12-6) that the Maclaurin formula expresses the function $f(x,y)$ in a series each term of which is a homogeneous polynomial in $x$ and $y$.
The procedure outlined above can be generalized easily to yield similar expansions for functions of more than two variables.
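Example: Obtain the expansion of $\tan^{-1}(y/x)$ about (1,1) up to the third-degree terms:
$$f(x,y) = \tan^{-1}\frac{y}{x}, \qquad f(1,1) = \tan^{-1} 1 = \frac{\pi}{4};$$
$$f_x(x,y) = -\frac{y}{x^2 + y^2}, \qquad f_x(1,1) = -\frac{1}{2};$$
$$f_y(x,y) = \frac{x}{x^2 + y^2}, \qquad f_y(1,1) = \frac{1}{2};$$
$$f_{xx}(x,y) = \frac{2xy}{(x^2 + y^2)^2}, \qquad f_{xx}(1,1) = \frac{1}{2};$$
$$f_{xy}(x,y) = \frac{y^2 - x^2}{(x^2 + y^2)^2}, \qquad f_{xy}(1,1) = 0;$$
$$f_{yy}(x,y) = -\frac{2xy}{(x^2 + y^2)^2}, \qquad f_{yy}(1,1) = -\frac{1}{2}.$$
Then
$$\tan^{-1}\frac{y}{x} = \frac{\pi}{4} - \frac{1}{2}(x - 1) + \frac{1}{2}(y - 1) + \frac{1}{2!}\left[\frac{1}{2}(x - 1)^2 - \frac{1}{2}(y - 1)^2\right] + \cdots.$$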
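The expansion of this example can be tested numerically: if the terms through the second degree are correct, the error should shrink roughly by a factor of 8 each time the displacement from (1,1) is halved. The sketch below is illustrative; the function names, the direction of approach, and the step sizes are choices made here.

```python
import math

def f(x, y):
    return math.atan(y / x)

def taylor2(x, y):
    """Terms of the expansion about (1, 1) through the second degree."""
    h, k = x - 1, y - 1
    return math.pi / 4 - h / 2 + k / 2 + 0.5 * (h * h / 2 - k * k / 2)

errors = []
for step in (0.1, 0.05, 0.025):
    x, y = 1 + step, 1 - step / 2   # approach (1, 1) along a fixed direction
    errors.append(abs(f(x, y) - taylor2(x, y)))
    print(step, errors[-1])
```

The cubic decay of the error reflects the omitted third-degree terms.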
PROBLEMS
1. Obtain the expansion for $xy^2 + \cos xy$ about $(1,\pi/2)$ up to the third-degree terms.
2. Expand $f(x,y) = e^{xy}$ at (1,1), obtaining three terms.
3. Expand $e^x\cos y$ at (0,0) up to the fourth-degree terms.
4. Show that for small values of $x$ and $y$
$$e^x\sin y = y + xy \quad \text{(approx)},$$
and
$$e^x\log(1 + y) = y + xy - \frac{y^2}{2} \quad \text{(approx)}.$$
5. Expand $f(x,y) = x^3y + x^2y + 1$ about (0,1).
6. Expand $\sqrt{1 - x^2 - y^2}$ about (0,0) up to the third-degree terms.
7. Show that the development obtained in Prob. 6 agrees with the binomial expansion of $[1 - (x^2 + y^2)]^{1/2}$.
INTEGRALS WITH SEVERAL VARIABLES
13. Differentiation under the Integral Sign. The fundamental theorem of integral calculus states that whenever $f(x)$ is a continuous function in the closed interval $(a,b)$ and $F(x)$ is any function such that $F'(x) = f(x)$, then
$$\int_{u_0}^{u_1} f(x)\,dx = F(u_1) - F(u_0) \tag{13-1}$$
for any two points $u_0$ and $u_1$ in the interval. If $u_0$ and $u_1$ are differentiable functions of another variable $\alpha$, so that
$$u_0 = u_0(\alpha), \qquad u_1 = u_1(\alpha),$$
the right-hand member in (13-1) is a function of $\alpha$ and the chain rule gives
$$\frac{dF(u_1)}{d\alpha} = F'(u_1)\frac{du_1}{d\alpha} = f(u_1)\frac{du_1}{d\alpha}.$$
Since a similar result holds for $F(u_0)$, differentiation of (13-1) yields the important formula
$$\frac{d}{d\alpha}\int_{u_0(\alpha)}^{u_1(\alpha)} f(x)\,dx = f(u_1)\frac{du_1}{d\alpha} - f(u_0)\frac{du_0}{d\alpha}. \tag{13-2}$$
If the variable $\alpha$ in (13-1) occurs under the integral sign, so that the integral takes the form
$$\varphi(\alpha) = \int_{u_0}^{u_1} f(x,\alpha)\,dx, \tag{13-3}$$
we can compute the derivative of $\varphi(\alpha)$ by calculating the limit of the difference quotient $\Delta\varphi/\Delta\alpha$ as $\Delta\alpha \to 0$. This calculation is simple when the limits $u_0$, $u_1$ are constant. Indeed in this case (13-3) gives
$$\Delta\varphi = \varphi(\alpha + \Delta\alpha) - \varphi(\alpha) = \int_{u_0}^{u_1} f(x,\alpha + \Delta\alpha)\,dx - \int_{u_0}^{u_1} f(x,\alpha)\,dx = \int_{u_0}^{u_1} [f(x,\alpha + \Delta\alpha) - f(x,\alpha)]\,dx.$$
Dividing by $\Delta\alpha$ and taking the limit as $\Delta\alpha \to 0$ give
$$\varphi'(\alpha) \equiv \lim_{\Delta\alpha\to 0}\frac{\varphi(\alpha + \Delta\alpha) - \varphi(\alpha)}{\Delta\alpha} = \lim_{\Delta\alpha\to 0}\int_{u_0}^{u_1}\frac{f(x,\alpha + \Delta\alpha) - f(x,\alpha)}{\Delta\alpha}\,dx \tag{13-4}$$
provided the limit on the right exists.
If we knew that
$$\lim_{\Delta\alpha\to 0}\int_{u_0}^{u_1}\frac{f(x,\alpha + \Delta\alpha) - f(x,\alpha)}{\Delta\alpha}\,dx = \int_{u_0}^{u_1}\lim_{\Delta\alpha\to 0}\frac{f(x,\alpha + \Delta\alpha) - f(x,\alpha)}{\Delta\alpha}\,dx, \tag{13-5}$$
then the right-hand member of (13-4) would give
$$\int_{u_0}^{u_1}\frac{\partial f}{\partial\alpha}\,dx$$
by the definition of partial derivative. We could then conclude that
$$\frac{d}{d\alpha}\int_{u_0}^{u_1} f(x,\alpha)\,dx = \int_{u_0}^{u_1} f_\alpha(x,\alpha)\,dx, \qquad u_0,\ u_1 \text{ const}. \tag{13-6}$$
Interchanging an integral with a limit operation as in (13-5) is not valid in general,¹ but the equality of (13-5) can, in fact, be justified when $f_\alpha(x,\alpha)$ is continuous, and hence (13-6) holds in that case.
Equation (13-2) requires that the integrand be independent of $\alpha$, while (13-6) assumes that the limits of integration are independent of $\alpha$. When the limits and also the integrand depend on $\alpha$, it can be shown² that the correct formula is given by addition of (13-2) and (13-6); namely,
$$\frac{d}{d\alpha}\int_{u_0(\alpha)}^{u_1(\alpha)} f(x,\alpha)\,dx = f(u_1,\alpha)\frac{du_1}{d\alpha} - f(u_0,\alpha)\frac{du_0}{d\alpha} + \int_{u_0(\alpha)}^{u_1(\alpha)} f_\alpha(x,\alpha)\,dx \tag{13-7}$$
provided that $u_0(\alpha)$ and $u_1(\alpha)$ are differentiable and $f(x,\alpha)$ and $f_\alpha(x,\alpha)$ are continuous. The formula (13-7), known as Leibniz's formula, will now be illustrated by several examples.
Example 1. Evaluate $\dfrac{d}{d\alpha}\displaystyle\int_0^1 \log(x^2 + \alpha^2)\,dx$. Inasmuch as the limits are constants, (13-6) yields, when $\alpha \neq 0$,
$$\frac{d}{d\alpha}\int_0^1 \log(x^2 + \alpha^2)\,dx = \int_0^1 \frac{2\alpha}{x^2 + \alpha^2}\,dx.$$
The resulting integral is easily evaluated by the fundamental theorem of integral calculus, since
$$\frac{d}{dx}\left(2\tan^{-1}\frac{x}{\alpha}\right) = \frac{2\alpha}{x^2 + \alpha^2}.$$
We thus obtain
$$\frac{d}{d\alpha}\int_0^1 \log(x^2 + \alpha^2)\,dx = 2\tan^{-1}\frac{x}{\alpha}\bigg|_0^1 = 2\tan^{-1}\frac{1}{\alpha}.$$
¹ The reader can verify that
$$\lim_{\alpha\to 0}\int_0^1 \frac{1}{\log\alpha}\,\frac{2x}{x^2 + \alpha^2}\,dx \neq \int_0^1 \lim_{\alpha\to 0}\frac{1}{\log\alpha}\,\frac{2x}{x^2 + \alpha^2}\,dx.$$
² See I. S. Sokolnikoff, "Advanced Calculus," pp. 121-122, McGraw-Hill Book Company, Inc., New York, 1939.
Example 2. If $\varphi(x) = \displaystyle\int_0^{x^4} x^2\,dx$, find $\varphi'(x)$ first by evaluating the integral and then differentiating, and also by the Leibniz rule. To avoid confusing the parameter $x$ appearing in the limits of the integral with the variable of integration $x$, we write
$$\varphi(x) = \int_0^{x^4} t^2\,dt = \frac{1}{3}t^3\bigg|_0^{x^4} = \frac{1}{3}x^{12}.$$
Hence $\varphi'(x) = 4x^{11}$. On the other hand, the application of the rule (13-2) yields
$$\varphi'(x) = \frac{d}{dx}\int_0^{x^4} t^2\,dt = (x^4)^2\,\frac{d(x^4)}{dx} = 4x^{11},$$
thus checking the result previously obtained.
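Example 3. Find $d\varphi/d\alpha$ if $\varphi(\alpha) = \displaystyle\int_{-\alpha^2}^{2\alpha} e^{-x^2/\alpha^2}\,dx$. Since the integrand and the limits in this integral are functions of $\alpha$, we use formula (13-7). Then
$$\frac{d\varphi}{d\alpha} = \int_{-\alpha^2}^{2\alpha} \frac{2x^2}{\alpha^3}\,e^{-x^2/\alpha^2}\,dx + e^{-4}(2) - e^{-\alpha^2}(-2\alpha) = \int_{-\alpha^2}^{2\alpha} \frac{2x^2}{\alpha^3}\,e^{-x^2/\alpha^2}\,dx + 2e^{-4} + 2\alpha e^{-\alpha^2}.$$
The integral appearing in this expression cannot be evaluated in a closed form in terms of elementary functions, but it can be readily computed in infinite series (see Chap. 2, Sec. 10).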
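The result of Example 3 can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the right-hand side of (13-7) with a midpoint rule and compares it with a central-difference estimate of $d\varphi/d\alpha$. The helper names (`midpoint`, `phi`, `dphi_leibniz`, `dphi_numeric`), the step sizes, and the test value $\alpha = 1.3$ are all illustrative assumptions.

```python
import math

def midpoint(f, a, b, n=4000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def phi(alpha):
    """phi(alpha): integral of exp(-x^2/alpha^2) from -alpha^2 to 2*alpha."""
    return midpoint(lambda x: math.exp(-x * x / alpha**2), -alpha**2, 2 * alpha)

def dphi_leibniz(alpha):
    """Right-hand side of (13-7) for the integral of Example 3."""
    inner = midpoint(
        lambda x: (2 * x * x / alpha**3) * math.exp(-x * x / alpha**2),
        -alpha**2, 2 * alpha)
    # Boundary terms: f(u1) du1/dalpha - f(u0) du0/dalpha
    return inner + 2 * math.exp(-4.0) + 2 * alpha * math.exp(-alpha**2)

def dphi_numeric(alpha, h=1e-4):
    """Central-difference estimate of d(phi)/d(alpha)."""
    return (phi(alpha + h) - phi(alpha - h)) / (2 * h)

alpha = 1.3  # arbitrary test value
print(dphi_leibniz(alpha), dphi_numeric(alpha))
```

The two printed numbers should agree to several decimal places; a disagreement would point to a sign error in one of the boundary terms.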
Example 4. Formula (13-6) can sometimes be used to evaluate definite integrals. Thus consider
$$\varphi(\alpha) = \int_0^1 \frac{x^\alpha - 1}{\log x}\,dx, \qquad \alpha > 0. \tag{13-8}$$
Differentiating under the integral sign, we get
$$\varphi'(\alpha) = \int_0^1 \frac{x^\alpha\log x}{\log x}\,dx = \int_0^1 x^\alpha\,dx.$$
The evaluation of the integral is easy, and we find
$$\varphi'(\alpha) = \frac{x^{\alpha+1}}{\alpha+1}\,\Big|_0^1 = \frac{1}{\alpha+1}.$$
Integrating again we get
$$\varphi(\alpha) = \log(\alpha+1) + c. \tag{13-9}$$
To evaluate the constant $c$, we note that for $\alpha = 0$, (13-8) gives $\varphi(0) = 0$ while (13-9) for $\alpha = 0$ requires that $\varphi(0) = \log 1 + c$. Hence $c = 0$. We finally have
$$\int_0^1 \frac{x^\alpha - 1}{\log x}\,dx = \log(\alpha+1).$$
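As a rough numerical check of Example 4, the sketch below approximates the integral by a midpoint rule and prints it next to $\log(\alpha+1)$. It is an illustration only: the function name `phi`, the number of subintervals, and the sample values of $\alpha$ are assumptions, and the midpoint rule is chosen simply because it never evaluates the integrand at $x = 0$ or $x = 1$, where $\log x$ is infinite or zero.

```python
import math

def phi(alpha, n=20000):
    """Midpoint-rule estimate of the integral of (x**alpha - 1)/log(x) over (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoints avoid the endpoints x = 0 and x = 1
        total += (x**alpha - 1.0) / math.log(x)
    return h * total

for alpha in (0.5, 1.0, 2.0):
    # The two printed values should be close to each other.
    print(alpha, phi(alpha), math.log(alpha + 1.0))
```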
PROBLEMS
1. Find $d\varphi/d\alpha$ if $\varphi(\alpha) = \displaystyle\int_0^{\pi/2} \sin\alpha x\,dx$ by using the Leibniz formula, and check your result by direct calculation.
а. Find ~ if *(«) - f*( 1 - « cos i) s dx.
da Jo
5. Find ~ if ?(a) « f tan" 1 dx.
4a Jo or
4. Find if <p(a) «■ C tan (x — a) dx.
da Jo
б. Find if <?(z) *» f Vx dx.
oa? do
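6. Show in the manner of Example 4 that
$$\varphi(\alpha) = \int_0^\pi \log(1+\alpha\cos x)\,dx = \pi\log\frac{1+\sqrt{1-\alpha^2}}{2} \qquad \text{if } \alpha^2 < 1.$$
7. Differentiate under the sign, and thus evaluate
$$\int_0^\pi \frac{dx}{(\alpha-\cos x)^2}$$
by using
$$\int_0^\pi \frac{dx}{\alpha-\cos x} = \frac{\pi}{(\alpha^2-1)^{1/2}} \qquad \text{if } \alpha^2 > 1.$$
8. Show that
$$\int_0^\pi \log(1-2\alpha\cos x+\alpha^2)\,dx = \begin{cases} 0 & \text{if } \alpha^2 < 1,\\ \pi\log\alpha^2 & \text{if } \alpha^2 > 1.\end{cases}$$
9. Verify that
$$y = \frac{1}{k}\int_0^x f(\alpha)\sin k(x-\alpha)\,d\alpha$$
is a solution of the differential equation
$$\frac{d^2y}{dx^2} + k^2y = f(x),$$
where $k$ is a constant.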
14. The Calculus of Variations. Physical laws can often be deduced
from concise mathematical principles to the effect that certain integrals
attain extreme values. Thus, the Fermat principle of optics asserts that
the actual path traversed by the light particle is such that the integral
representing the travel time between two points in every medium is a
minimum. Also a considerable part of mechanics can be deduced from
the principle of minimum potential energy, stating that the equilibrium
configuration of a mechanical system corresponds to the minimum value
of a certain integral related to the work done on the system by the forces
acting on it. For example, the shape assumed by a flexible chain fixed
between two fixed points is such that its center of gravity is as low as
possible. To say that the center of gravity is as low as possible is equivalent
to saying that the potential energy of the system is as small as possible.
The problems concerned with the determination of extreme values of
integrals whose integrands contain unknown functions belong to the calculus
of variations. The simplest of such problems concerns the determination of an unknown function $y = y(x)$ for which the integral
$$I = \int_{x_0}^{x_1} F(x,y,y')\,dx \tag{14-1}$$
between two fixed points $P_0(x_0,y_0)$ and $P_1(x_1,y_1)$ is a minimum. The function $F$ of the variables $x$, $y$, and $y' \equiv dy/dx$ is assumed to be known.
If we imagine that the points $P_0$ and $P_1$ in the $xy$ plane are joined by a sufficiently smooth curve $y = f(x)$, then the substitution of $y = f(x)$ and $y' = f'(x)$ in the integrand of (14-1) yields the integral $I(f)$ whose value, ordinarily, depends on the choice of the curve $y = f(x)$. We ask the question: What is the equation of the curve $y = y(x)$ joining $P_0$ and $P_1$ which makes the value of the integral (14-1) a minimum? To be certain that this question makes sense, it is necessary to impose some restrictions on the integrand in (14-1) and to specify how the curves that enter in competition for the minimum value of $I$ are to be chosen.
We shall suppose that $F(x,y,y')$, viewed as a function of its arguments $x$, $y$, and $y'$, has continuous partial derivatives of the second order, and we assume that there is a curve $y = y(x)$ with continuously turning tangent that minimizes the integral. We then choose the competing family of curves as follows. Let $y = \eta(x)$ be any function with continuous second derivatives which vanishes at the end points of the interval $(x_0,x_1)$. Then
$$\eta(x_0) = 0, \qquad \eta(x_1) = 0. \tag{14-2}$$
If $\alpha$ is a small parameter,
$$\bar y(x) = y(x) + \alpha\eta(x) \tag{14-3}$$
represents a family of curves passing through $(x_0,y_0)$ and $(x_1,y_1)$, since the minimizing curve $y = y(x)$ passes through these points and $\eta(x_0) = \eta(x_1) = 0$. The situation here is that indicated in Fig. 10. The vertical deviation of a curve in the family (14-3) from the minimizing curve is $\alpha\eta(x)$; it is called the variation of $y(x)$.
Now if we substitute $\bar y$ and $\bar y'$ from (14-3) for $y$ and $y'$ in the integral (14-1), we get a function of $\alpha$:
$$I(\alpha) = \int_{x_0}^{x_1} F[x,\ y(x)+\alpha\eta(x),\ y'(x)+\alpha\eta'(x)]\,dx. \tag{14-4}$$
For $\alpha = 0$, Eq. (14-3) yields $\bar y(x) = y(x)$, and since $y = y(x)$ minimizes the integral, we conclude that $I(\alpha)$ must have a minimum for $\alpha = 0$. A necessary condition for this is
$$\left.\frac{dI}{d\alpha}\right|_{\alpha=0} = 0. \tag{14-5}$$
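We can compute the derivative of $I(\alpha)$ by differentiating (14-4) under the integral sign and get
$$I'(\alpha) = \int_{x_0}^{x_1} \frac{\partial}{\partial\alpha}F(x,Y,Y')\,dx, \tag{14-6}$$
where we have set
$$Y = y(x) + \alpha\eta(x), \qquad Y' = y'(x) + \alpha\eta'(x). \tag{14-7}$$
But by the rule for the differentiation of composite functions,
$$\frac{\partial F(x,Y,Y')}{\partial\alpha} = \frac{\partial F}{\partial Y}\frac{\partial Y}{\partial\alpha} + \frac{\partial F}{\partial Y'}\frac{\partial Y'}{\partial\alpha} = \frac{\partial F}{\partial Y}\,\eta(x) + \frac{\partial F}{\partial Y'}\,\eta'(x),$$
so that (14-6) can be written as
$$I'(\alpha) = \int_{x_0}^{x_1}\left[\frac{\partial F}{\partial Y}\,\eta(x) + \frac{\partial F}{\partial Y'}\,\eta'(x)\right]dx. \tag{14-8}$$
Since $I'(0) = 0$ by (14-5), we get, on setting $\alpha = 0$ in (14-8),
$$\int_{x_0}^{x_1}\left[\frac{\partial F}{\partial y}\,\eta(x) + \frac{\partial F}{\partial y'}\,\eta'(x)\right]dx = 0, \tag{14-9}$$
because for $\alpha = 0$, it is evident from (14-7) that $Y = y(x)$, $Y' = y'(x)$.
The second term in the integral (14-9) can be integrated by parts to yield
$$\int_{x_0}^{x_1}\frac{\partial F}{\partial y'}\,\eta'(x)\,dx = \frac{\partial F}{\partial y'}\,\eta(x)\,\Big|_{x_0}^{x_1} - \int_{x_0}^{x_1}\eta(x)\,\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)dx = -\int_{x_0}^{x_1}\eta(x)\,\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)dx,$$
since the integrated part drops out because of (14-2). Accordingly, we can write (14-9) as
$$\int_{x_0}^{x_1}\left[\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right)\right]\eta(x)\,dx = 0. \tag{14-10}$$
But $\eta(x)$ is an arbitrary function vanishing at the end points of the interval.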
Since the integral (14-10) must vanish for every choice of $\eta(x)$, it is easy to conclude that¹
$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0. \tag{14-11}$$
This equation is called the Euler equation. On carrying out the differentiation indicated in (14-11), we get the second-order ordinary differential equation
$$\frac{\partial F}{\partial y} - \frac{\partial^2 F}{\partial x\,\partial y'} - \frac{\partial^2 F}{\partial y\,\partial y'}\,y' - \frac{\partial^2 F}{\partial y'^2}\,y'' = 0 \tag{14-12}$$
for the determination of the minimizing function $y(x)$.
The general solution of (14-12) contains two arbitrary constants which must be chosen so that the curve $y = y(x)$ passes through $(x_0,y_0)$ and $(x_1,y_1)$.
It should be noted that the solution of Euler's equation (14-11) may not
yield the minimizing curve because the condition (14-5) is necessary but
not sufficient for a minimum. Ordinarily one must verify whether or not
this solution yields the curve that actually minimizes the integral, but
frequently geometrical or physical considerations enable one to tell whether
the curve so obtained makes the integral a maximum, a minimum, or
neither.
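The role of the one-parameter family (14-3) can be seen in a small numerical experiment. The sketch below is an illustration under stated assumptions, not the authors' procedure: it takes the arc-length integrand $F = \sqrt{1+y'^2}$ on $(0,1)$, for which the Euler equation reduces to $y'' = 0$, uses the extremal $y = x$ and the variation $\eta(x) = \sin\pi x$ (which vanishes at both end points), and tabulates $I(\alpha)$ of (14-4). The function names and the choice of $\eta$ are hypothetical.

```python
import math

def arc_length(alpha, n=2000):
    """I(alpha) of (14-4) for F = sqrt(1 + y'^2), y(x) = x, eta(x) = sin(pi x)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        slope = 1.0 + alpha * math.pi * math.cos(math.pi * x)  # y' + alpha * eta'
        total += math.sqrt(1.0 + slope * slope)
    return h * total

def derivative_at_zero(h=1e-4):
    """Central-difference estimate of dI/dalpha at alpha = 0, cf. (14-5)."""
    return (arc_length(h) - arc_length(-h)) / (2 * h)

for alpha in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(alpha, arc_length(alpha))  # smallest value expected at alpha = 0
print(derivative_at_zero())          # expected to be close to zero
```

In this instance the stationary value at $\alpha = 0$ is in fact a minimum, but as the text notes, condition (14-5) alone does not guarantee that.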
Similar calculations when performed on the integral
$$I(y) = \int_{x_0}^{x_1} F(x,y,y',y'',\ldots,y^{(n)})\,dx \tag{14-13}$$
yield the Euler equation
$$F_y - \frac{d}{dx}F_{y'} + \frac{d^2}{dx^2}F_{y''} - \cdots + (-1)^n\frac{d^n}{dx^n}F_{y^{(n)}} = 0. \tag{14-14}$$
The foregoing discussion can also be generalized to the problem of minimizing the double integral
$$I(u) = \iint_R F(x,y,u,u_x,u_y)\,dx\,dy, \tag{14-15}$$
in which the competing functions $u(x,y)$ assume on the boundary $C$ of the region $R$ preassigned continuous values $u = \varphi(s)$. If it is supposed that $F$, viewed as a function of $x$, $y$, $u$, $u_x \equiv \partial u/\partial x$, $u_y \equiv \partial u/\partial y$, has continuous second partial derivatives with respect to these arguments, the Euler equation corresponding to the integral (14-15) turns out to be
$$F_u - \frac{\partial F_{u_x}}{\partial x} - \frac{\partial F_{u_y}}{\partial y} = 0. \tag{14-16}$$
¹ The proof is by contradiction. Assume that the function in the brackets of (14-10) is not zero at some point $x = \xi$ of the interval $(x_0,x_1)$. Then since it is a continuous function, there will be a subinterval $l$ about $x = \xi$ throughout which $F_y - \frac{d}{dx}F_{y'}$ has the same sign as at $x = \xi$. Choose $\eta(x)$ so that it has the same sign as $F_y - \frac{d}{dx}F_{y'}$ in $l$ and vanishes outside this subinterval. For such a choice of $\eta(x)$, the integrand in (14-10) will be positive, and thus the integral will fail to vanish as demanded by (14-10).
A special form of the integral (14-15) is of particular interest in the study of the Dirichlet problem, which occurs in numerous applications.¹ It is
$$I(u) = \iint_R [(u_x)^2 + (u_y)^2 + 2f(x,y)u]\,dx\,dy, \tag{14-17}$$
where $f(x,y)$ is a known function. The substitution of $F = (u_x)^2 + (u_y)^2 + 2fu$ in (14-16) yields the Poisson equation
$$\nabla^2 u = f(x,y). \tag{14-18}$$
It can be shown that the solution of this equation,² assuming specified continuous values $u = \varphi(s)$ on the boundary $C$ of the region, actually minimizes the integral (14-17) on the set of all competing functions which take on $C$ the same boundary values $\varphi(s)$.
Example. What is the equation of the curve $y = y(x)$ for which the area of the surface of revolution got by revolving the curve about the $x$ axis is a minimum?
The integral to be minimized in this problem is
$$I = 2\pi\int_{x_0}^{x_1} y\,ds = 2\pi\int_{x_0}^{x_1} y\sqrt{1+y'^2}\,dx. \tag{14-19}$$
It has the form (14-1) with
$$F(x,y,y') = 2\pi y\sqrt{1+y'^2}. \tag{14-20}$$
The substitution from (14-20) in the Euler equation (14-11), after simple calculations, yields
$$\sqrt{1+y'^2} - \frac{d}{dx}\,\frac{yy'}{\sqrt{1+y'^2}} = 0,$$
or
$$yy'' - y'^2 - 1 = 0.$$
This second-order equation is easily solved by setting $y' = p$, $y'' = p\,dp/dy$ (cf. Chap. 1, Sec. 13). The result is
$$y = c_1\cosh\frac{x-c_2}{c_1}, \tag{14-21}$$
so that the desired curve is a catenary. The integration constants $c_1$ and $c_2$ in the general solution (14-21) must be determined so that the curve passes through given points $(x_0,y_0)$, $(x_1,y_1)$.
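That the catenary (14-21) satisfies $yy'' - y'^2 - 1 = 0$ can be confirmed numerically. The following sketch is only an illustration: the constants `C1` and `C2`, the sample points, and the finite-difference step are arbitrary assumptions, and the derivatives are approximated by central differences rather than computed in closed form.

```python
import math

C1, C2 = 1.5, 0.3  # illustrative constants, not taken from the text

def y(x):
    """Catenary of (14-21)."""
    return C1 * math.cosh((x - C2) / C1)

def residual(x, h=1e-4):
    """y*y'' - y'^2 - 1, with derivatives from central differences."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return y(x) * d2 - d1 * d1 - 1.0

for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, residual(x))  # each residual should be small
```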
¹ See Chap. 6, Sec. 12, and Chap. 7, Sec. 21.
² See, for example, I. S. Sokolnikoff, “Mathematical Theory of Elasticity,” 2d ed., sec. 106, McGraw-Hill Book Company, Inc., 1956.
PROBLEMS
1. Show that the curve of minimum length joining a pair of given points in the plane is a straight line. Hint: Minimize
$$\int_{x_0}^{x_1}\sqrt{1+y'^2}\,dx.$$
2 . Solve Prob. 1 by taking
dr .
3. When a bead slides from rest along any smooth curve $C$ from the point $P$ to a point $Q$ on $C$, the speed $v$ of the bead is $v = \sqrt{2gh}$, where $h$ is the vertical distance from $P$ to $Q$. Hence the travel time from $P$ to $Q$ is $t = \displaystyle\int_P^Q \frac{ds}{v}$. Choose $P$ at the origin, and show that the curve for which the travel time is a minimum is a cycloid.
4. Consider the integral $I = \displaystyle\int_{x_0}^{x_1}\frac{\sqrt{1+y'^2}}{y}\,dx$, and show that the general solution of the associated Euler's equation is $y^2 + (x-c_1)^2 = c_2^2$. Discuss this solution.
5. Obtain Euler's equation for the integral
Hit) = J \p{r)(t /) 2 4- <lU)ip d 2f(x)y]dx.
Special cases of this integral arise in the study of deflection of bars and strings
15. Variational Problems with Constraints. Occasionally one seeks a maximum or minimum value of the integral
$$I = \int_{x_0}^{x_1} F(x,y,y')\,dx, \tag{15-1}$$
discussed in the preceding section, subject to the condition that another integral
$$J = \int_{x_0}^{x_1} G(x,y,y')\,dx \tag{15-2}$$
have a known constant value. A physical problem of this sort has already been mentioned in Sec. 14, where it was required to find the shape of the chain which minimizes the potential energy while the length of the chain is given. This is one of the so-called isoperimetric problems of the calculus of variations.¹
¹ Isoperimetric because the length (or the perimeter) of the curve is given.
It is natural to attempt to solve the problem $I = \min$ subject to the condition $J = \text{const}$ by the method of Lagrange multipliers. We construct the integral
$$I + \lambda J = \int_{x_0}^{x_1}[F(x,y,y') + \lambda G(x,y,y')]\,dx \tag{15-3}$$
and consider the free extremum of the integral (15-3). The corresponding Euler's equation (14-11) is
$$\frac{\partial(F+\lambda G)}{\partial y} - \frac{d}{dx}\,\frac{\partial(F+\lambda G)}{\partial y'} = 0, \tag{15-4}$$
and on carrying out the indicated differentiation¹ in (15-4), we get the second-order ordinary differential equation containing the parameter $\lambda$. The general solution of this equation, in addition to $\lambda$, will contain two arbitrary integration constants. The integration constants and the parameter $\lambda$ must then be determined so that the curve $y = y(x)$ passes through the given end points and satisfies the constraining condition (15-2).
The justification of this procedure is based on an argument similar to that used in Sec. 14, where instead of the one-parameter family of the neighboring curves (14-3) one constructs a suitable two-parameter family.²
16. Change of Variables in Multiple Integrals. The reader will recall that the double integral $\iint_R f(x,y)\,dA$ of a continuous function $f(x,y)$ specified in a closed two-dimensional region $R$ of the $xy$ plane is defined as the limit of the sum formed in the following way: The region $R$ is subdivided into $n$ elements of area $\Delta A_i$, and the value of $f(x,y)$ is computed at some point $(\xi_i,\eta_i)$ of the $\Delta A_i$; the sum $\sum_{i=1}^n f(\xi_i,\eta_i)\,\Delta A_i$ is then formed, and its limit is calculated when the number of elements $\Delta A_i$ is allowed to increase indefinitely in such a way that the greatest linear dimensions of the elements tend to zero. Thus,
$$\int_R f(x,y)\,dA \equiv \lim_{n\to\infty}\sum_{i=1}^n f(\xi_i,\eta_i)\,\Delta A_i. \tag{16-1}$$
The calculation of the limit in (16-1) is usually performed by repeated evaluations of two simple integrals, so that
$$\int_R f(x,y)\,dA = \int_{x=a}^{x=b}\int_{y=g_1(x)}^{y=g_2(x)} f(x,y)\,dy\,dx. \tag{16-2}$$
The limits in (16-2) are determined from the equations of the boundary of the region (Fig. 11). The triple integral of $f(x,y,z)$ is defined similarly, by subdividing the three-dimensional region $R$ into volume elements $\Delta\tau_i$ and by forming the corresponding sum. Thus
$$\int_R f(x,y,z)\,d\tau = \lim_{n\to\infty}\sum_{i=1}^n f(\xi_i,\eta_i,\zeta_i)\,\Delta\tau_i. \tag{16-3}$$
¹ Compare Eq. (14-12).
² See G. A. Bliss, “Calculus of Variations,” Carus Monograph, The Open Court Publishing Co., LaSalle, Ill., 1925.
The limit of the sum in (16-3) is usually evaluated by repeated single integrations. One can write, for example,
$$\int_R f(x,y,z)\,d\tau = \int_{x=a}^{x=b}\int_{y=h_1(x)}^{y=h_2(x)}\int_{z=g_1(x,y)}^{z=g_2(x,y)} f(x,y,z)\,dz\,dy\,dx, \tag{16-4}$$
in which the integration limits are determined from equations of the bounding surfaces.
The evaluation of multiple integrals can frequently be simplified by making appropriate changes of the independent variables. Thus in dealing with double integrals, it may prove advantageous to replace $x$ and $y$ by new variables $u$ and $v$ related to $x$ and $y$ by the transformation
$$u = f_1(x,y), \qquad v = f_2(x,y) \tag{16-5}$$
with suitable properties.
We shall suppose that the functions $f_1$, $f_2$ in (16-5) have continuous first partial derivatives in the region $R$ and that the Jacobian of (16-5)
$$J(x,y) \equiv \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial v}{\partial x} \\[2mm] \dfrac{\partial u}{\partial y} & \dfrac{\partial v}{\partial y} \end{vmatrix} \tag{16-6}$$
does not vanish in the region $R$. In this event, Eqs. (16-5) can be solved for $x$ and $y$ to yield the differentiable solution¹
$$x = \varphi_1(u,v), \qquad y = \varphi_2(u,v). \tag{16-7}$$
If $u$ and $v$ are assigned some fixed values, say $u_0$ and $v_0$, the equations
$$u_0 = f_1(x,y), \qquad v_0 = f_2(x,y),$$
determine two curves which will intersect in a point $(x_0,y_0)$, such that
$$x_0 = \varphi_1(u_0,v_0), \qquad y_0 = \varphi_2(u_0,v_0).$$
¹ See, for example, I. S. Sokolnikoff, “Advanced Calculus,” chap. 12, McGraw-Hill Book Company, Inc., New York, 1939.
Thus the pair of numbers $(u_0,v_0)$ determines the point $(x_0,y_0)$ in the $xy$ plane (Fig. 12).
If $u$ and $v$ are assigned a sequence of constant values
$$(u_1,v_1),\ (u_2,v_2),\ \ldots,\ (u_n,v_n),$$
a network of curves will be determined that will intersect in the points
$$(x_1,y_1),\ (x_2,y_2),\ \ldots,\ (x_n,y_n).$$
Corresponding to any point whose rectangular coordinates are $(x,y)$ there will be a pair of curves $u = \text{const}$ and $v = \text{const}$, which pass through this point. The totality of numbers $(u,v)$ defines a curvilinear coordinate system, and the curves themselves are called the coordinate lines.
Thus, if
$$u = \sqrt{x^2+y^2}, \qquad v = \tan^{-1}\frac{y}{x},$$
the family of curves $u = \text{const}$ is a family of circles whereas $v = \text{const}$ defines a family of radial lines. The curvilinear coordinate system, in this case, is the ordinary polar coordinate system (Fig. 13).
In the cartesian $xy$ coordinates the element of area $dA = dx\,dy$ is the area of a rectangle formed by the intersection of the coordinate lines $x = x_0$, $x = x_0 + dx$, $y = y_0$, $y = y_0 + dy$, as shown in Fig. 14. In the curvilinear $uv$ coordinates the element of area $dA$ can be visualized as
the area of the quadrilateral $P_1P_2P_3P_4$ formed by the intersection of the coordinate lines $u = u_0$, $u = u_0 + du$, $v = v_0$, $v = v_0 + dv$, shown in Fig. 15.
The expression for the element of area $dA$ in curvilinear coordinates $(u,v)$ can be calculated with the aid of Eqs. (16-7), but it is somewhat simpler to follow the method of Sec. 2, Chap. 5 [see, in particular, Eq. (2-17) of that section] to show that
$$dA = |J(u,v)|\,du\,dv, \tag{16-8}$$
where
$$J(u,v) \equiv \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix} \tag{16-9}$$
is the Jacobian of the transformation (16-7).
The double integral in (16-2) can then be evaluated in the $uv$ coordinates by substituting in $f(x,y)$ from (16-7) to obtain $f[\varphi_1(u,v),\varphi_2(u,v)] \equiv F(u,v)$ and writing $dA$ in the form (16-8). Thus,
$$\int_R f(x,y)\,dA = \iint F(u,v)\,|J(u,v)|\,du\,dv. \tag{16-10}$$
The limits of integration in (16-10) are determined from the equations of the boundary of $R$ referred to the $uv$ coordinates.
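Formula (16-10) can be tried out on a case where the answer is known in closed form. The sketch below is illustrative only: it takes $f(x,y) = e^{-(x^2+y^2)}$ over the disk $x^2+y^2 \le a^2$, for which the exact value is $\pi(1-e^{-a^2})$, and compares a direct cartesian midpoint sum with a midpoint sum in the polar variables $x = u\cos v$, $y = u\sin v$, where $|J(u,v)| = u$. The function names, grid sizes, and the choice of $f$ are assumptions.

```python
import math

def f(x, y):
    return math.exp(-(x * x + y * y))

def cartesian(a, n=400):
    """Midpoint sum of f over the disk x^2 + y^2 <= a^2 on an n-by-n grid."""
    h = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h
        for j in range(n):
            y = -a + (j + 0.5) * h
            if x * x + y * y <= a * a:  # keep only cells whose center lies in R
                total += f(x, y)
    return total * h * h

def polar(a, n=400):
    """Midpoint sum of F(u, v) |J| du dv with x = u cos v, y = u sin v, |J| = u."""
    du = a / n
    dv = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            total += f(u * math.cos(v), u * math.sin(v)) * u
    return total * du * dv

a = 1.0
exact = math.pi * (1.0 - math.exp(-a * a))
print(cartesian(a), polar(a), exact)
```

In the $uv$ plane the region is the rectangle $0 \le u \le a$, $0 \le v \le 2\pi$, which is why the polar sum needs no boundary test and is typically the more accurate of the two for the same grid size.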
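Similar considerations apply to a change of variables $(x,y,z)$ in the triple integral (16-4) by the transformation
$$u = f_1(x,y,z), \qquad v = f_2(x,y,z), \qquad w = f_3(x,y,z), \tag{16-11}$$
with the Jacobian
$$J(x,y,z) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial v}{\partial x} & \dfrac{\partial w}{\partial x} \\[2mm] \dfrac{\partial u}{\partial y} & \dfrac{\partial v}{\partial y} & \dfrac{\partial w}{\partial y} \\[2mm] \dfrac{\partial u}{\partial z} & \dfrac{\partial v}{\partial z} & \dfrac{\partial w}{\partial z} \end{vmatrix}. \tag{16-12}$$
If the solutions of (16-11) are
$$x = \varphi_1(u,v,w), \qquad y = \varphi_2(u,v,w), \qquad z = \varphi_3(u,v,w), \tag{16-13}$$
the element of volume $d\tau$ in the $uvw$ system can be taken as¹
$$d\tau = |J(u,v,w)|\,du\,dv\,dw, \tag{16-14}$$
where $J(u,v,w)$ is the Jacobian of the transformation (16-13), so that
$$J(u,v,w) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v} \\[2mm] \dfrac{\partial x}{\partial w} & \dfrac{\partial y}{\partial w} & \dfrac{\partial z}{\partial w} \end{vmatrix}. \tag{16-15}$$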
Example 1. Let it be required to find the moment of inertia of the area of the circle (Fig. 16)
$$x^2 + y^2 - ax = 0$$
about a diameter of the circle. It is convenient to introduce the polar coordinates
$$x = \rho\cos\theta, \qquad y = \rho\sin\theta,$$
so that the equation of the circle becomes
$$\rho = a\cos\theta.$$
Calculating the determinant $J$ gives
$$J = \begin{vmatrix} \cos\theta & \sin\theta \\ -\rho\sin\theta & \rho\cos\theta \end{vmatrix} = \rho,$$
so that
$$dA = \rho\,d\rho\,d\theta.$$
[Fig. 16: the circle $\rho = a\cos\theta$ with the element of area $dA = \rho\,d\rho\,d\theta$.]
¹ Cf. Chap. 5, Sec. 2.
1 See Chap. 5, Sec. 1.
Fig. 17
These equations are in the form (16-13) with $\rho = u$, $\theta = v$, $\phi = w$, and it is easy to verify that (16-14) yields
$$d\tau = \rho^2\sin\theta\,d\rho\,d\theta\,d\phi.$$
Consequently, the substitution from (16-16) gives
$$\int_R x\,d\tau = \int_{\phi=0}^{\pi/2}\int_{\theta=0}^{\pi/2}\int_{\rho=0}^{a} \rho\sin\theta\cos\phi\;\rho^2\sin\theta\,d\rho\,d\theta\,d\phi = \frac{\pi a^4}{16}.$$
Thus,
$$\bar x = \frac{\pi a^4/16}{\pi a^3/6} = \frac{3a}{8}.$$
PROBLEMS
1. Use cylindrical coordinates $(r,\theta,z)$ defined by
$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z,$$
to compute the moment of inertia of the volume of a right-circular cylinder of height $h$ and radius $a$ about its axis. Also evaluate the integral in cartesian coordinates.
2. Compute the expression for $dA$ in terms of $u$ and $v$ if
$$x = u(1-v), \qquad y = uv.$$
3. Compute the expression for $d\tau$ in terms of $u$, $v$, and $w$ if
$$x = u(1-v), \qquad y = uv, \qquad z = uvw.$$
4. Show that in the cylindrical coordinates of Prob. 1, the element of volume $d\tau = r\,dr\,d\theta\,dz$.
5. Use the cylindrical coordinates of Prob. 1 to find the volume enclosed by the circular cylinder $r = 2a\cos\theta$, the cone $z = r$, and the plane $z = 0$.
6. Evaluate $\int_R e^{-(x^2+y^2)}\,dy\,dx$, where $R$ is the region bounded by the circle $x^2 + y^2 = a^2$. Use polar coordinates.
7. Find the area outside $\rho = a(1+\cos\theta)$ and inside $\rho = 3a\cos\theta$.
8. Find the coordinates of the center of gravity of the area between $\rho = 2\sin\theta$ and $\rho = 4\sin\theta$.
9. Calculate the elements of area in the $uv$ coordinate systems which are related to the cartesian coordinate system $xy$ by means of the following equations of transformation:
(a) $x = u + a$, $y = v + b$;
(b) $x = au$, $y = bv$;
(c) $x = u\cos\alpha - v\sin\alpha$, $y = u\sin\alpha + v\cos\alpha$;
where $a$, $b$, and $\alpha$ are constants. Interpret your results geometrically.
10. What are the regions of integration in the $uv$ coordinate systems of Prob. 9 if the region $R$ in the $xy$ plane is the interior of the ellipse
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1?$$
11. Discuss the curvilinear coordinate system defined by the relations
$$x = u + v, \qquad y = u - v,$$
and describe the region in the $uv$ plane corresponding to the square $x = 1$, $x = 2$, $y = 1$, $y = 2$.
12. Discuss the curvilinear coordinate system defined by the relations
$$u = x^2 - y^2, \qquad v = 2xy.$$
Sketch the curves $u = \text{const}$ and $v = \text{const}$.
13. Show that the attraction of a homogeneous sphere at a point exterior to the sphere is the same as though all of the mass of the sphere were concentrated at the center of the sphere. Assume the inverse-square law of force.
14. The Newtonian potential $V$, due to a body $T$, at a point $P$ is defined by the equation $V(P) = \displaystyle\int_T \frac{dm}{r}$, where $dm$ is the element of mass of the body and $r$ is the distance from the point $P$ to the element of mass $dm$. Show that the potential of a homogeneous spherical shell of inner radius $b$ and outer radius $a$ is
$$V = 2\pi\sigma(a^2 - b^2), \qquad \text{if } r < b,$$
and
$$V = \frac{4}{3}\,\pi\sigma\,\frac{a^3 - b^3}{r}, \qquad \text{if } r > a,$$
where $\sigma$ is the density.
15. Find the Newtonian potential on the axis of a homogeneous circular cylinder of radius $a$.
16. Show that the force of attraction of a right-circular cone upon a point at its vertex is $2\pi\sigma h(1 - \cos\alpha)$, where $h$ is the altitude of the cone and $2\alpha$ is the angle at the vertex.
17. Show that the force of attraction of a homogeneous right-circular cylinder upon a point on its axis is
$$2\pi\sigma\left[h + \sqrt{R^2 + a^2} - \sqrt{(R+h)^2 + a^2}\right],$$
where $h$ is altitude, $a$ is radius, and $R$ is the distance from the point to one base of the cylinder.
17. Surface Integrals. A surface is usually defined as a locus of points determined by the equation
$$z = f(x,y), \tag{17-1}$$
where $f(x,y)$ is a continuous function specified in some region of the $xy$ plane. This definition, however, is too broad to permit one to formulate a meaningful concept of the surface area. Since most surfaces encountered in applications are two-sided and piecewise smooth, we confine our considerations to such surfaces only.
The surface defined by (17-1) is called smooth if it has continuous partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$ at each of its points. This implies that a smooth surface has a continuously turning tangent plane and hence a well-defined normal at each of its points.¹
¹ We recall that the equation of the tangent plane to (17-1) at a point $P(x_0,y_0,z_0)$ is
$$z - z_0 = \left(\frac{\partial z}{\partial x}\right)_P (x - x_0) + \left(\frac{\partial z}{\partial y}\right)_P (y - y_0),$$
so that the direction of the normal at $P$ is determined by the ratios $(\partial z/\partial x)_P : (\partial z/\partial y)_P : -1$ (cf. Sec. 10, Chap. 4).
The surface is said to be piecewise smooth if it can be subdivided by smooth curves into a finite number of pieces, each of which is smooth. Thus, the surface of a cube is piecewise smooth.
The surface is two-sided when it is possible to paint it with two different colors to distinguish the sides.¹ If two oppositely directed normals $PN$ and $PN'$ (Fig. 18) are drawn at a point $P$ of a smooth two-sided surface and $P$ is allowed to move along any path that does not cross the edge of the surface, the direction of $PN$ can never be brought into coincidence with $PN'$.
It is intuitively clear that a small element of a smooth surface is nearly flat, so that a neighborhood of any point on it is well approximated by a portion of the tangent plane. This observation suggests a procedure for constructing a meaningful definition of the area of a smooth surface.
Thus, let $S'$ be a smooth portion of the surface $S$ bounded by a closed curve $C$ (Fig. 19). We shall suppose that $S'$ is such that every line parallel to some coordinate axis (say the $z$ axis) cuts $S'$ in just one point. If the projection $C'$ of $C$ on the $xy$ plane encloses the region $R$, we can subdivide $R$ into $n$ small subregions $\Delta R_i$ by the families of straight lines parallel to the $x$ and $y$ axes. The planes through these lines, normal to the region $R$, cut from $S'$ small regions $\Delta S'_i$ of areas $\Delta\sigma_i$. Let $\Delta A_i$ be the area of $\Delta R_i$. The projection of $\Delta\sigma_i$ on the $xy$ plane is, approximately,
$$\Delta A_i = \cos\gamma_i\,\Delta\sigma_i,$$
where $\cos\alpha_i$, $\cos\beta_i$, and $\cos\gamma_i$ are the direction cosines of the normal $N$ to $S$ at a point $(x_i,y_i,z_i)$ of $\Delta S'_i$. Since¹
$$\cos\alpha_i : \cos\beta_i : \cos\gamma_i = \left(\frac{\partial z}{\partial x}\right)_i : \left(\frac{\partial z}{\partial y}\right)_i : -1$$
and
$$\cos^2\alpha_i + \cos^2\beta_i + \cos^2\gamma_i = 1,$$
we have
$$\cos\gamma_i = \frac{\pm 1}{\sqrt{(\partial z/\partial x)_i^2 + (\partial z/\partial y)_i^2 + 1}}.$$
Using the positive value for $\cos\gamma_i$, which amounts to the choice of the positive direction of $N$, we can write
$$\Delta\sigma_i = \sec\gamma_i\,\Delta A_i = \sqrt{\left(\frac{\partial z}{\partial x}\right)_i^2 + \left(\frac{\partial z}{\partial y}\right)_i^2 + 1}\;\Delta A_i.$$
The surface area of $S'$ can then be approximated by the sum
$$\sum_{i=1}^n \sqrt{\left(\frac{\partial z}{\partial x}\right)_i^2 + \left(\frac{\partial z}{\partial y}\right)_i^2 + 1}\;\Delta A_i,$$
and we define the area $\sigma$ of $S'$ by the integral
$$\sigma = \iint_R \sqrt{\left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2 + 1}\;dA. \tag{17-2}$$
The integral (17-2) can be evaluated by repeated integrations to yield, for example,
$$\sigma = \iint_R \sqrt{\left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2 + 1}\;dy\,dx.$$
By considering the projections $R'$ and $R''$ of $S'$ on the other coordinate planes, we deduce similar formulas, with the roles of the coordinates interchanged.
¹ At first glance, it may appear that all surfaces are two-sided, but this is not the case. A simple example of one-sided surface, whose boundary is a closed curve, is given in Sec. 6, Chap. 5.
To obtain the surface area of a piecewise smooth surface we need merely to add the areas of its smooth pieces.
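The area integral (17-2) lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: it takes the spherical cap $z = \sqrt{a^2 - x^2 - y^2}$ lying over the disk $x^2 + y^2 \le b^2$ with $b < a$, whose area is known from elementary geometry to be $2\pi a\,(a - \sqrt{a^2 - b^2})$, approximates the partial derivatives by central differences, and sums over the disk in polar variables. The constants `A` and `B` and all function names are hypothetical.

```python
import math

A = 2.0  # sphere radius (illustrative value)
B = 1.0  # radius of the disk R in the xy plane, B < A

def z(x, y):
    return math.sqrt(A * A - x * x - y * y)

def area_element(x, y, h=1e-5):
    """sqrt(z_x^2 + z_y^2 + 1), with partial derivatives from central differences."""
    zx = (z(x + h, y) - z(x - h, y)) / (2 * h)
    zy = (z(x, y + h) - z(x, y - h)) / (2 * h)
    return math.sqrt(zx * zx + zy * zy + 1.0)

def cap_area(n=200):
    """Midpoint sum of (17-2) over the disk R, taken in polar coordinates."""
    dr = B / n
    dt = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            total += area_element(r * math.cos(t), r * math.sin(t)) * r
    return total * dr * dt

exact = 2.0 * math.pi * A * (A - math.sqrt(A * A - B * B))
print(cap_area(), exact)
```

The disk is kept strictly inside the sphere's equator because the integrand of (17-2) becomes unbounded where the tangent plane is vertical.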
The surface integral of a continuous function $\varphi(x,y,z)$ specified on the surface $S'$ is defined as follows: Let $S'$ be subdivided into subregions $\Delta S'_i$ of areas $\Delta\sigma_i$ and form the sum
$$\sum_{i=1}^n \varphi(x_i,y_i,z_i)\,\Delta\sigma_i, \tag{17-3}$$
where $(x_i,y_i,z_i)$ is some point in $\Delta S'_i$. The limit of the sum (17-3) as $n \to \infty$ in such a way that the greatest linear dimensions of the $\Delta S'_i$ tend to zero is the surface integral of $\varphi(x,y,z)$ over $S'$. It is denoted by the symbol
$$\int_{S'} \varphi(x,y,z)\,d\sigma. \tag{17-4}$$
The integral (17-4) can be evaluated by repeated integrations. Thus, if
$$d\sigma = \sec\gamma\,dA = \sqrt{\left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2 + 1}\;dx\,dy,$$
then
$$\int_{S'} \varphi(x,y,z)\,d\sigma = \iint_R \varphi(x,y,f(x,y))\sqrt{\left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2 + 1}\;dx\,dy,$$
where $z = f(x,y)$ is the equation of $S'$ and $R$ is the projection of $S'$ on the $xy$ plane.
We shall consider surface integrals in somewhat greater detail in Chap. 5.
¹ See the first footnote in this section.
Example 1. Find the surface of the sphere $x^2 + y^2 + z^2 = a^2$ cut off by the cylinder $x^2 - ax + y^2 = 0$ (Fig. 20).
From symmetry it is clear that it will suffice to determine the surface in the first octant and multiply the result by 4. Now,
$$\frac{\partial z}{\partial x} = \frac{-x}{\sqrt{a^2 - x^2 - y^2}}, \qquad \frac{\partial z}{\partial y} = \frac{-y}{\sqrt{a^2 - x^2 - y^2}}.$$
Thus, the integral becomes
$$\sigma = 4\int_0^a\int_0^{\sqrt{ax - x^2}} \sqrt{\frac{x^2}{a^2-x^2-y^2} + \frac{y^2}{a^2-x^2-y^2} + 1}\;dy\,dx = 4\int_0^a\int_0^{\sqrt{ax - x^2}} \frac{a\,dy\,dx}{\sqrt{a^2 - x^2 - y^2}}.$$
It is simpler to evaluate this integral by transforming to cylindrical coordinates. The equation of the cylinder becomes $r = a\cos\theta$, and that of the sphere
$$z = \sqrt{a^2 - x^2 - y^2} = \sqrt{a^2 - r^2}.$$
Thus,
$$\sigma = 4\int_0^{\pi/2}\int_0^{a\cos\theta} \frac{ar\,dr\,d\theta}{\sqrt{a^2 - r^2}} = 2a^2(\pi - 2).$$
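Example 2. Find the $z$ coordinate of the center of gravity of one octant of the surface of the sphere $x^2 + y^2 + z^2 = a^2$. Now,
$$\bar z = \frac{\int_{S'} z\,d\sigma}{\int_{S'} d\sigma} = \frac{\iint_R z\sqrt{\left(\dfrac{\partial z}{\partial x}\right)^2 + \left(\dfrac{\partial z}{\partial y}\right)^2 + 1}\;dx\,dy}{4\pi a^2/8} = \frac{\iint_R a\,dx\,dy}{\pi a^2/2} = \frac{a}{2}.$$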
PROBLEMS
1. Find, by the method of Sec. 17, the area of the surface of the sphere $x^2 + y^2 + z^2 = a^2$ that lies in the first octant.
2. Find the surface of the sphere $x^2 + y^2 + z^2 = a^2$ cut off by the cylinder $x^2 - ax + y^2 = 0$.
3. Find the volume bounded by the cylinder and the sphere of Prob. 2.
4. Find the surface of the cylinder $x^2 + y^2 = a^2$ cut off by the cylinder $y^2 + z^2 = a^2$.
5. Find the coordinates of the center of gravity of the portion of the surface of the sphere cut off by the right-circular cone whose vertex is at the center of the sphere.
6. If a sphere is inscribed in a right-circular cylinder, then the surfaces of the sphere and the cylinder intercepted by a pair of planes perpendicular to the axis of the cylinder are equal in area. Prove it.
CHAPTER 4
ALGEBRA AND GEOMETRY OF VECTORS. MATRICES
Fundamental Operations
1. Scalars, Vectors, and Equality
2. Addition, Subtraction, and Multiplication by Scalars
3. Base Vectors
4. The Dot Product
5. The Cross Product
6. Continued Products
7. Differentiation
Applications
8. Mechanics and Dynamics
9. Lines and Planes
10. Normal Lines and Tangent Planes
11. Frenet's Formulas
Linear Vector Spaces and Matrices
12. Spaces of Higher Dimensions
13. The Dimensionality of Space. Linear Vector Spaces
14. Cartesian Reference Frames
15. Summation Convention. Cramer's Rule
16. Matrices
17. Linear Transformations
18. Transformation of Base Vectors
19. Orthogonal Transformations
20. The Diagonalization of Matrices
21. Real Symmetric Matrices and Quadratic Forms
22. Solution of Systems of Linear Equations
It is desirable to treat directed quantities like force or velocity (which are independent of coordinate systems) without reference to a set of coordinate axes. Such a coordinate-free treatment is made possible by the analytical shorthand known as vector analysis. The trajectory of a particle, the dynamics of rigid bodies, and the theory of fluid flow are readily studied by vector methods, as are also such topics as the geometry of curves and surfaces. Introduction of coordinates yields a correspondence between vectors and sets of numbers, and this correspondence permits the use of vector methods in the study of linear equations. Such a study leads to the concept of matrix, which has proved fruitful in a variety of fields, ranging from circuit analysis to quantum theory.
FUNDAMENTAL OPERATIONS
1. Scalars, Vectors, and Equality. Some quantities appearing in the study of physical phenomena can be completely specified by their magnitude alone. Thus, the mass of a body can be described by the number of grams, the temperature by degrees on some scale, the volume by the number of cubic units, and so on. A quantity that (after a suitable choice of units) can be completely characterized by a single number is called a scalar. There are also quantities, called vectors, that require for their complete characterization the specification of direction as well as magnitude. An example of a vector quantity is the displacement of translation of a particle. If a particle is displaced from a position P to a new position P' (Fig. 1), then the change in position can be represented graphically by the directed line segment PP' whose length equals the amount of the displacement and whose direction is from P to P'. Similarly, a force of magnitude K dynes can be represented by a line segment whose length is K units and whose direction coincides with that of the force.
The initial point P of a directed line segment representing a vector is
called the origin , and the representation as an arrow suggests that the
terminal point be called the head of the vector. In many problems the
location of the origin for any given vector is immaterial, and in such
problems two vectors are regarded as equal if they have the same length
and the same direction. Such vectors, which need not coincide to be equal,
are termed free vectors . In mechanics, it is sometimes convenient to
specify vectors by giving the line of action as well as the length and di-
rection. Equality of these so-called sliding vectors means that the lengths,
directions, and lines of action coincide. Again, in the treatment of space
curves and trajectories one is led to specify the origin of the vector
as well as its length and direction. Such vectors are termed bound vectors.
To distinguish vectors from scalars, boldface type is used for vectors in
this book. The length (or magnitude) of the vector A is denoted by |A|:
|A| = length of A. (1-1)
Equality is denoted by the usual symbol: A = B. For the most part this
have the same length and direction.
2. Addition, Subtraction, and Multiplication by Scalars. If a particle
is displaced from its initial position P to P', so that PP' = A, and
subsequently it is displaced to a position P", so that P'P" = B, then the
displacement from the original position P to the final position P" can be
accomplished by the single displacement PP" = C. Thus, it is logical
to write
$$\mathbf{A} + \mathbf{B} = \mathbf{C}$$
as the definition of vector addition (Fig. 2). In
words, if the initial point of the vector B is placed in
coincidence with the terminal point of the vector A,
then the vector C, which joins the initial point of
A with the terminal point of B, is called the sum
of A and B and is denoted by A + B = C. This
is the familiar parallelogram law of addition used in
physics, and its extension to three or more vectors
is obvious. The symbol + behaves like the + of elementary algebra, in
that
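$$\begin{aligned}
\mathbf{A} + \mathbf{B} &= \mathbf{B} + \mathbf{A}, && \text{commutative law,}\\
\mathbf{A} + (\mathbf{B} + \mathbf{C}) &= (\mathbf{A} + \mathbf{B}) + \mathbf{C}, && \text{associative law.}
\end{aligned} \tag{2-1}$$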
A proof is implicit in Figs. 3 and 4. The associative law enables us to omit
parentheses, writing A + B + C for A + (B + C).
It is desirable to give meaning to expressions like 5A, the product of a
scalar and a vector. In agreement with the meaning of multiplication
familiar from arithmetic, one defines
$$5\mathbf{A} = \mathbf{A} + \mathbf{A} + \mathbf{A} + \mathbf{A} + \mathbf{A} \tag{2-2}$$
(Fig. 5) and similarly in other cases. By a natural extension of this
reasoning, tA is defined as a vector whose length is |t| |A| and whose
direction is that of A if t is positive but opposite to that of A if t is
negative. One defines At by the equation
$$\mathbf{A}t = t\mathbf{A}. \tag{2-3}$$
It follows that 1A = A and also
$$\begin{aligned}
s(t\mathbf{A}) &= (st)\mathbf{A}, && \text{associative law,}\\
(s + t)\mathbf{A} &= s\mathbf{A} + t\mathbf{A}, && \text{distributive law,}\\
t(\mathbf{A} + \mathbf{B}) &= t\mathbf{A} + t\mathbf{B}, && \text{distributive law.}
\end{aligned} \tag{2-4}$$
A vector of zero length is denoted by 0 and termed the zero vector. To
introduce the idea of subtraction, one defines -A as the solution of the
equation A + X = 0. Evidently, -A is a vector equal in length to A
but of opposite direction, so that -A = (-1)A. As in elementary
algebra, B - A is used as an abbreviation for B + (-A).
Since the laws governing the addition of vectors and multiplication of
vectors by scalars are identical with those met in ordinary algebra, one is
justified in using the familiar rules of algebra to solve linear equations involv-
ing vectors.
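The laws (2-1) and (2-4) can be spot-checked numerically. The sketch below is only an illustration: it assumes that vectors are stored as tuples of components (a representation the text introduces in Sec. 3), and the helper names `add` and `scale` as well as the sample values are hypothetical.

```python
from fractions import Fraction

def add(a, b):
    """Componentwise sum of two vectors given as tuples."""
    return tuple(x + y for x, y in zip(a, b))

def scale(t, a):
    """Product of the scalar t and the vector a."""
    return tuple(t * x for x in a)

# Sample vectors and scalars (arbitrary illustrative values).
A, B, C = (1, 2, 3), (-1, 0, 4), (2, -5, 1)
s, t = Fraction(3, 2), -2

commutative = add(A, B) == add(B, A)                      # (2-1)
associative = add(A, add(B, C)) == add(add(A, B), C)      # (2-1)
scalar_associative = scale(s, scale(t, A)) == scale(s * t, A)                   # (2-4)
distributive_scalars = scale(s + t, A) == add(scale(s, A), scale(t, A))        # (2-4)
distributive_vectors = scale(t, add(A, B)) == add(scale(t, A), scale(t, B))    # (2-4)

# Solving the linear equation A + X = B by ordinary algebra: X = B + (-1)A.
X = add(B, scale(-1, A))
```

A check on particular numbers is of course not a proof; it merely illustrates that the familiar rules of algebra apply.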
Example 1. The point P in Fig. 6 divides the segment AB in the ratio m:n. Express
R in terms of the vectors A, B and the scalars m, n.
The vector $\mathbf{X} = \overrightarrow{AB}$ satisfies A + X = B by the definition of vector addition, and
hence, solving for X,
$$\mathbf{X} = \mathbf{B} - \mathbf{A}. \tag{2-5}$$
(This exemplifies the so-called "head-minus-tail" rule for vector subtraction.) The vector
$\overrightarrow{AP}$ is m/(m + n) times X, by the hypothesis and by the definition of multiplication
by scalars. Since $\mathbf{R} = \mathbf{A} + \overrightarrow{AP}$ we have, finally,
$$\mathbf{R} = \mathbf{A} + \frac{m}{m+n}\,\mathbf{X} = \mathbf{A} + \frac{m}{m+n}\,(\mathbf{B} - \mathbf{A}) = \frac{n\mathbf{A} + m\mathbf{B}}{m+n}.$$
Example 2. Prove that the medians of every triangle intersect at a point two-thirds
of the way from each vertex to the opposite side.
Let two sides of the triangle be specified by vectors A and B, as in Fig. 7, so that the
third side is B - A (cf. Example 1). The vector median to the side B - A is 1/2 the
diagonal of the parallelogram on A, B; hence this median is (A + B)/2 (compare the special
case m = n of Example 1). If the point M in Fig. 7 is two-thirds of the way from the
vertex O to the side B - A, along this median, then
$$\overrightarrow{OM} = \frac{2}{3}\,\frac{\mathbf{A} + \mathbf{B}}{2} = \frac{\mathbf{A} + \mathbf{B}}{3}. \tag{2-6}$$
The vector median to the side A is A/2 - B, again by the head-minus-tail rule. If N
is at a point two-thirds of the way toward the side A on this median, then
$$\overrightarrow{ON} = \mathbf{B} + \frac{2}{3}\left(\frac{\mathbf{A}}{2} - \mathbf{B}\right) = \frac{\mathbf{A} + \mathbf{B}}{3}.$$
Comparison with (2-6) shows that the two points M, N coincide. That the third median
also has the required behavior follows by interchanging the roles of A and B.
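The result of Example 2 can be illustrated with a short computation. This is a minimal sketch that assumes plane vectors stored as coordinate tuples; the function `point_on_median` and the particular triangle are hypothetical choices made for the illustration.

```python
from fractions import Fraction

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(t, a):
    return tuple(t * x for x in a)

def point_on_median(vertex, p, q):
    """Point two-thirds of the way from `vertex` to the mid-point of side pq."""
    midpoint = scale(Fraction(1, 2), add(p, q))
    return add(vertex, scale(Fraction(2, 3), sub(midpoint, vertex)))

# Triangle with one vertex at the origin O and sides A and B issuing from O.
O, A, B = (0, 0), (4, 1), (1, 5)

M = point_on_median(O, A, B)   # along the median from O
N = point_on_median(B, O, A)   # along the median from the head of B
L = point_on_median(A, O, B)   # along the median from the head of A
expected = scale(Fraction(1, 3), add(A, B))   # (A + B)/3
```

Exact rational arithmetic is used so that the three points can be compared for equality without rounding error.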
PROBLEMS
1. Sketch a vector A of length 1.5 in., parallel to the lower edge of your paper and
having an arrow on its right-hand end. Sketch a second vector B of length 1 in., making
an angle of 30° with A. Now sketch 2A, 3B, A + B, A - B, 2A - 3B, (A + B)/2.
2. Give a condition on three vectors A, B, C which ensures that they can form a
triangle. Generalize to n vectors A, B, C, . . ., L.
3. Graphically and algebraically, show how to find two vectors A and B if their sum
S and difference D are known.
4. Sketch three vectors A, B, C issuing from a common point. On your figure show
the vectors A - C, B - A, C - B, and thus illustrate the algebraic identity
(A - C) + (B - A) + (C - B) = 0.
5. (a) Write down a vector of unit length which has the same direction as a given
nonzero vector A. (b) Using the result (a), write down a vector bisecting the angle formed
by two nonzero vectors A, B issuing from a common point.
6. Show that a line from a vertex of a parallelogram to the mid-point of a nonadjacent
side trisects a diagonal.
3. Base Vectors. Any vector A lying in the plane of two noncollinear
vectors a and b can be resolved into so-called components directed along a and b.
This resolution is accomplished by constructing the parallelogram whose sides are
parallel to a and b (Fig. 8). Then one can
write
$$\mathbf{A} = x\mathbf{a} + y\mathbf{b},$$
where x and y are the appropriate scalars.
If three noncoplanar vectors a, b, and c are given, then any vector V
can be expressed uniquely as
$$\mathbf{V} = x\mathbf{a} + y\mathbf{b} + z\mathbf{c}, \tag{3-1}$$
where V is the diagonal of the parallelepiped whose edges are xa, yb, and
zc (Fig. 9). The vectors a, b, and c are called the base vectors, and the
scalars x, y, and z the measure numbers.
An important set of base vectors, denoted by i, j, and k, consists of unit
vectors directed along the positive directions of the x, y, and z axes,
respectively (Fig. 10). It is assumed that the system of axes is a
right-handed system; that is, a right-hand screw directed along the positive
z axis advances in the positive direction when it is rotated from the positive
x axis toward the positive y axis through the smaller (90°) angle. Because
i, j, k are mutually orthogonal, the representation
$$\mathbf{A} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$$
yields the important formula
$$|\mathbf{A}|^2 = x^2 + y^2 + z^2 \tag{3-2}$$
by use of Pythagoras's theorem (Fig. 11).
Example: If A = i + 2j + 3k and B = -j + 4k, compute the length of 2A - B.
Since 2A - B = 2i + 5j + 2k, we have
$$|2\mathbf{A} - \mathbf{B}| = (2^2 + 5^2 + 2^2)^{1/2} = \sqrt{33}.$$
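The example above can be reproduced mechanically. The following sketch assumes that a vector xi + yj + zk is stored as the tuple (x, y, z); the helper names are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(t, a):
    return tuple(t * x for x in a)

def length(a):
    """Length of a vector from its components, as in formula (3-2)."""
    return math.sqrt(sum(x * x for x in a))

A = (1, 2, 3)     # i + 2j + 3k
B = (0, -1, 4)    # -j + 4k

D = sub(scale(2, A), B)                  # 2A - B
squared_length = sum(x * x for x in D)   # |2A - B|^2
print(D, squared_length, length(D))
```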
PROBLEMS
1. (a) In the form ai + bj + ck write down two vectors of length 5 parallel to the y
axis. (b) If A = i + 2j + 3k, B = i + j + k, C = i - k, compute A + B, (A + B) +
C, B + C, and A + (B + C). What law does this illustrate? (c) In (b), find 5A, -2A,
the sum of these vectors, and the vector 3A. What law does this illustrate? (d) Also
find 3A, 3B, the sum of these vectors, and the vector 3(A + B). (e) In (b), a certain
vector D is such that A, B, D can be placed head against tail to form a triangle. What
is the z component of D?
2. Sketch the triangle with vertices at the heads of i + j + k, 2j + k, and 2i + j,
and make the sides into vectors with head against tail. Find the vectors forming the
sides of the triangle, and verify that the sum of these vectors is zero.
3. Draw a figure illustrating the inequality |A + B| ≤ |A| + |B|, and by combining
this with (3-2), deduce an algebraic inequality. Can you give a purely algebraic
proof?
4. (a) Let A, B, C, . . . be vectors from the center to the vertices of a regular decagon
(ten-sided polygon). By choosing a suitable basis i, j and using symmetry, show that
the sum A + B + C + ··· is zero. (b) By picking another basis i', j', with i' making an
angle θ with A, deduce the identity cos θ + cos (θ + π/5) + cos (θ + 2π/5) + ··· +
cos (θ + 9π/5) = 0.
4. The Dot Product. The dot product¹ of two vectors is defined to be
the product of their lengths by the cosine of the angle between them. In
symbols,
$$\mathbf{A}\cdot\mathbf{B} = |\mathbf{A}|\,|\mathbf{B}| \cos(\mathbf{A},\mathbf{B}), \tag{4-1}$$
where (A,B) is the angle from A to B. Thus A·B is a scalar, not a vector.
Geometrically,
$$\mathbf{A}\cdot\mathbf{B} = |\mathbf{A}| \times (\text{projection of } \mathbf{B} \text{ on } \mathbf{A}) = |\mathbf{B}| \times (\text{projection of } \mathbf{A} \text{ on } \mathbf{B}). \tag{4-2}$$
Evidently (A,B) can be measured in several ways. However, since cos θ
= cos (-θ) = cos (2π - θ), these different measures all yield the same
value for A·B. The fact that cos θ = cos (-θ) also yields
$$\mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A}, \qquad \text{commutative law,} \tag{4-3}$$
and one easily verifies the additional properties
$$(t\mathbf{A})\cdot\mathbf{B} = t(\mathbf{A}\cdot\mathbf{B}), \qquad \text{associative law,} \tag{4-4}$$
$$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = \mathbf{A}\cdot\mathbf{B} + \mathbf{A}\cdot\mathbf{C}, \qquad \text{distributive law.} \tag{4-5}$$
For proof of (4-5) use (4-1) to transform (4-5) into
$$|\mathbf{A}|\,|\mathbf{B} + \mathbf{C}| \cos\psi = |\mathbf{A}|\,|\mathbf{B}| \cos\phi + |\mathbf{A}|\,|\mathbf{C}| \cos\theta, \tag{4-6}$$
where the angles are defined in Fig. 12. Now (4-6) follows from
$$|\mathbf{B} + \mathbf{C}| \cos\psi = |\mathbf{B}| \cos\phi + |\mathbf{C}| \cos\theta \tag{4-7}$$
and (4-7) is evident from Fig. 12 when the vectors are coplanar and the angles are in
the first quadrant. In view of (4-2) the property amounts merely to the assertion that
projections are additive, and the extension to arbitrary angles is not difficult.
¹ The terms scalar product and inner product are often used.
For the mutually orthogonal unit vectors i, j, k introduced in Sec. 3
we have, by inspection of (4-1),
$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1, \qquad \mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{k} = \mathbf{i}\cdot\mathbf{k} = 0. \tag{4-8}$$
Hence, expanding the product by (4-4) and (4-5), we get
$$(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})\cdot(x_1\mathbf{i} + y_1\mathbf{j} + z_1\mathbf{k}) = xx_1 + yy_1 + zz_1. \tag{4-9}$$
By (4-1) and (4-9) the dot product gives a simple way to find the angle
between two vectors and, in particular, to decide when two vectors are
perpendicular. Indeed, if we agree to regard the zero vector as
perpendicular to every vector, then from (4-1)
$$\mathbf{A}\cdot\mathbf{B} = 0 \quad \text{if, and only if,} \quad \mathbf{A} \perp \mathbf{B}. \tag{4-10}$$
The case in which B is parallel to A is also worthy of note. In particular,
when B = A we have
$$\mathbf{A}\cdot\mathbf{A} = |\mathbf{A}|^2. \tag{4-11}$$
Example. Compute the cosine of the angle between A and B if A = i + j + 2k,
B = -i + zk, and find a value of z for which A ⊥ B.
We have A·B = -1 + 0 + 2z = 2z - 1 and hence, by (4-1),
$$\cos(\mathbf{A},\mathbf{B}) = \frac{2z - 1}{|\mathbf{A}|\,|\mathbf{B}|} = \frac{2z - 1}{\sqrt{6}\,\sqrt{1 + z^2}}.$$
The result is zero, and hence the vectors are perpendicular, when z = 1/2.
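Formula (4-9) makes the dot product easy to compute. The sketch below reworks the preceding example under the assumption that vectors are stored as component tuples; the names `dot`, `cos_angle`, and `B` are hypothetical.

```python
import math
from fractions import Fraction

def dot(a, b):
    """Dot product from components, as in formula (4-9)."""
    return sum(x * y for x, y in zip(a, b))

def cos_angle(a, b):
    """Cosine of the angle between two nonzero vectors, from (4-1) and (4-11)."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

A = (1, 1, 2)            # i + j + 2k

def B(z):
    return (-1, 0, z)    # -i + zk

z_perpendicular = Fraction(1, 2)
dot_at_half = dot(A, B(z_perpendicular))   # 2z - 1 evaluated at z = 1/2
cosine_at_3 = cos_angle(A, B(3))           # (2z - 1) / (sqrt(6) sqrt(1 + z^2)) at z = 3
```

The test for perpendicularity uses an exact fraction so that the dot product is exactly zero rather than merely small.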
PROBLEMS
1. Given A = i + 2j + 3k, B = -i + 2j + k, C = 2i + j. (a) Find the dot product
of 3i + 2j + k with each of these vectors. (b) Find A·B, A·C, B + C, A·(B + C).
What law is illustrated? (c) Find 2A and (2A)·B. Compare A·B as found in (b). (d)
Find the angle between A and B. (e) Find the projection of A on C. (f) Find a scalar s
such that A + sB is perpendicular to A. (g) Find a vector of form ix + jy + k which
is perpendicular both to A and to B.
2. (a) Show that i + j + k, i - k, and i - 2j + k are mutually orthogonal. (b)
Choose x, y, and z so that i + j + 2k, -i + zk, and 2i + xj + yk are mutually
orthogonal.
3. (a) If A·B = A·C for some A ≠ 0, is it necessary that B = C? Illustrate your
answer by an example. (b) If A·B = A·C for every A, is it necessary that B = C?
5. The Cross Product. Besides the multiplication just considered there
is a second kind of multiplication, which yields a product known as the
vector product or cross product. The cross product of A and B, denoted
by A × B, is a vector C which is normal to the plane of A and B and is
so directed that the vectors A, B, C form a right-handed system. The
length of C is the product of the length of A by the length of B by the
sine of the smaller angle between them:
$$|\mathbf{A} \times \mathbf{B}| = |\mathbf{A}|\,|\mathbf{B}| \sin(\mathbf{A},\mathbf{B}). \tag{5-1}$$
The expression (5-1) represents the area of the parallelogram having A, B
as adjacent edges (Fig. 13). The student is warned, incidentally, that
(5-1) does not give A × B; it gives the length |A × B| only.
Since rotation from B to A is opposite to that from A to B, we have
$$\mathbf{A} \times \mathbf{B} = -\mathbf{B} \times \mathbf{A}, \tag{5-2}$$
so that the commutative law does not hold for vector products. On the
other hand it is the case that
$$(t\mathbf{A}) \times \mathbf{B} = t(\mathbf{A} \times \mathbf{B}), \qquad \text{associative law,} \tag{5-3}$$
$$\mathbf{A} \times (\mathbf{B} + \mathbf{C}) = \mathbf{A} \times \mathbf{B} + \mathbf{A} \times \mathbf{C}, \qquad \text{distributive law.} \tag{5-4}$$
The proof of equation (5-3) is trivial, and (5-4) is readily established if we note that
A × V is obtained from the arbitrary vector V by performing the following three
operations $O_i$, illustrated in Fig. 14:
$O_1$: Project V on the plane perpendicular to A to obtain a vector $\mathbf{V}_1 \perp \mathbf{A}$ of
magnitude |V| sin (A,V).
$O_2$: Multiply $\mathbf{V}_1$ by |A| to obtain $\mathbf{V}_2 \perp \mathbf{A}$ of magnitude |A| |V| sin (A,V).
$O_3$: Rotate $\mathbf{V}_2$ about A through 90° to obtain $\mathbf{V}_3 = \mathbf{A} \times \mathbf{V}$.
It is easily checked that each of these operators is distributive; that is,
$O_i(\mathbf{B} + \mathbf{C}) = O_i\mathbf{B} + O_i\mathbf{C}$ for all vectors B and C. Hence the composite operator $O_3O_2O_1$ is distributive;
namely,
$$\begin{aligned}
O_3O_2O_1(\mathbf{B} + \mathbf{C}) &= O_3O_2(O_1\mathbf{B} + O_1\mathbf{C}), && \text{since } O_1 \text{ is distributive,}\\
&= O_3(O_2O_1\mathbf{B} + O_2O_1\mathbf{C}), && \text{since } O_2 \text{ is distributive,}\\
&= O_3O_2O_1\mathbf{B} + O_3O_2O_1\mathbf{C}, && \text{since } O_3 \text{ is distributive.}
\end{aligned}$$
Because $O_3O_2O_1\mathbf{V} = \mathbf{A} \times \mathbf{V}$ for every vector V, the latter equation yields (5-4).
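The definitions of vector product and of i, j, k lead to
$$\begin{aligned}
\mathbf{i} \times \mathbf{i} &= \mathbf{j} \times \mathbf{j} = \mathbf{k} \times \mathbf{k} = \mathbf{0}, & \mathbf{i} \times \mathbf{j} &= -\mathbf{j} \times \mathbf{i} = \mathbf{k},\\
\mathbf{j} \times \mathbf{k} &= -\mathbf{k} \times \mathbf{j} = \mathbf{i}, & \mathbf{k} \times \mathbf{i} &= -\mathbf{i} \times \mathbf{k} = \mathbf{j}.
\end{aligned} \tag{5-5}$$
If A and B are given by their components as
$$\mathbf{A} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}, \qquad \mathbf{B} = x_1\mathbf{i} + y_1\mathbf{j} + z_1\mathbf{k},$$
then expansion by means of (5-3) and (5-4) and simplification by means of
(5-5) yield
$$\mathbf{A} \times \mathbf{B} = \mathbf{i}(yz_1 - zy_1) + \mathbf{j}(x_1z - xz_1) + \mathbf{k}(xy_1 - yx_1),$$
which may be written as a determinant¹
$$\mathbf{A} \times \mathbf{B} =
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\ x & y & z\\ x_1 & y_1 & z_1 \end{vmatrix}
= \mathbf{i}\begin{vmatrix} y & z\\ y_1 & z_1 \end{vmatrix}
- \mathbf{j}\begin{vmatrix} x & z\\ x_1 & z_1 \end{vmatrix}
+ \mathbf{k}\begin{vmatrix} x & y\\ x_1 & y_1 \end{vmatrix}. \tag{5-6}$$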
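Example. Find a vector perpendicular to i + 2k and i + j - k, and find the area of
the triangle with these two vectors as adjacent sides.
Both questions are settled by calculating the cross product. We have, from (5-6),
$$(\mathbf{i} + 2\mathbf{k}) \times (\mathbf{i} + \mathbf{j} - \mathbf{k}) =
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\ 1 & 0 & 2\\ 1 & 1 & -1 \end{vmatrix}
= -2\mathbf{i} + 3\mathbf{j} + \mathbf{k}.$$
This vector is perpendicular to the given vectors. The area of the triangle is half the
area of the parallelogram:
$$\text{Area} = \tfrac{1}{2}\,|-2\mathbf{i} + 3\mathbf{j} + \mathbf{k}| = \tfrac{1}{2}\sqrt{14}.$$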
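The component formula for the cross product is easily turned into a small routine. The sketch below reworks the example above; it assumes component tuples (x, y, z), and the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product from components, following the expansion (5-6)."""
    (x, y, z), (x1, y1, z1) = a, b
    return (y * z1 - z * y1, z * x1 - x * z1, x * y1 - y * x1)

A = (1, 0, 2)     # i + 2k
B = (1, 1, -1)    # i + j - k

N = cross(A, B)                       # perpendicular to both A and B
triangle_area = math.sqrt(dot(N, N)) / 2
print(N, triangle_area)
```

Taking the dot product of N with A and with B gives a quick check on the perpendicularity, and reversing the factors illustrates (5-2).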
PROBLEMS
1. Given A = i + 2j + k, B = 3i + 2j, C = -i + j + 3k. (a) Find A × B, A × C,
A × B + A × C, B + C, and A × (B + C). What law is illustrated? (b) Find a vector
perpendicular to B and C, and verify your answer by use of the dot product. (c) If A,
B, C have their origins at a common point, find a vector perpendicular to the plane in
which their heads lie. (d) Find the area of the triangle formed by the heads in (c).
2. Show that the cross product for each two of the following vectors is parallel to
the third: i + j + k, i - k, i - 2j + k. What does this indicate about the vectors?
3. Give an example of three unequal vectors such that the cross product of any two is
perpendicular to the third.
4. If A × B = 0 and A·B = 0, is it necessary that A = 0 or B = 0?
5. In refraction at the plane interface of two homogeneous media let A, B, C be unit
vectors, respectively along the incident, reflected, and refracted rays, and let N be the
unit normal to the interface. (a) Show that the law of reflection is equivalent to
A × N = B × N. (b) Show that the law of refraction is equivalent to $n_1\mathbf{A} \times \mathbf{N} = n_2\mathbf{C} \times \mathbf{N}$,
where $n_1$ and $n_2$ are the indices of refraction.
¹ The reader unfamiliar with second- or third-order determinants is referred to
Appendix A.
6. Continued Products. With the two multiplications previously defined,
we can form the products (A·B)C, A·(B × C) and A × (B × C);
some of the other possible combinations, however, have no meaning.
For example, (A·B) × C is meaningless because the two factors in a
cross product must both be vectors.
The first product, (A·B)C, denotes simply the product of the scalar
A·B with the vector C and may be dismissed without further comment.
By definition of dot product, the
second expression, A·(B × C), called the scalar triple product, has the value
$$\mathbf{A}\cdot(\mathbf{B} \times \mathbf{C}) = |\mathbf{A}| \cos\theta\, |\mathbf{B} \times \mathbf{C}|, \tag{6-1}$$
where θ is the angle between A and B × C. Since B × C is perpendicular
to the face of the parallelepiped containing B and C (Fig. 15), and since
|B × C| is the area of this face, (6-1) shows that A·(B × C) represents the
signed volume of the parallelepiped having A, B, C as adjacent edges.
Moreover, we have the formula
$$\mathbf{A}\cdot(\mathbf{B} \times \mathbf{C}) =
\begin{vmatrix} A_x & A_y & A_z\\ B_x & B_y & B_z\\ C_x & C_y & C_z \end{vmatrix},
\qquad
\begin{aligned}
\mathbf{A} &= \mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z,\\
\mathbf{B} &= \mathbf{i}B_x + \mathbf{j}B_y + \mathbf{k}B_z,\\
\mathbf{C} &= \mathbf{i}C_x + \mathbf{j}C_y + \mathbf{k}C_z,
\end{aligned} \tag{6-2}$$
as will now be seen. The expression (5-6) yields
$$\mathbf{B} \times \mathbf{C} =
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\ B_x & B_y & B_z\\ C_x & C_y & C_z \end{vmatrix}
= \mathbf{i}P + \mathbf{j}Q + \mathbf{k}R, \tag{6-3}$$
say, where P, Q, R are certain second-order determinants. Taking the
dot product of $\mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z$ with (6-3) leads to
$$\mathbf{A}\cdot(\mathbf{B} \times \mathbf{C}) = A_xP + A_yQ + A_zR,$$
which is the expansion of the determinant (6-2) on elements of the first row.
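The equality of the scalar triple product and the determinant (6-2) can be illustrated numerically. This is a sketch under the assumption that vectors are stored as component tuples; `det3`, `triple`, and the sample vectors are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    (x, y, z), (x1, y1, z1) = a, b
    return (y * z1 - z * y1, z * x1 - x * z1, x * y1 - y * x1)

def det3(r1, r2, r3):
    """Third-order determinant expanded on the elements of the first row."""
    return (r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
            - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
            + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0]))

def triple(a, b, c):
    """Scalar triple product A . (B x C): signed volume of the parallelepiped."""
    return dot(a, cross(b, c))

A, B, C = (1, 2, 3), (0, 1, 4), (5, 6, 0)   # arbitrary sample vectors
volume = triple(A, B, C)
determinant = det3(A, B, C)
```

Interchanging two of the vectors, which interchanges two rows of the determinant, reverses the sign of the result.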
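Since interchanging two rows of a determinant merely changes its sign,
(6-2) yields the useful relations
$$\begin{aligned}
\mathbf{A}\cdot(\mathbf{B} \times \mathbf{C}) &= \mathbf{B}\cdot(\mathbf{C} \times \mathbf{A}) = \mathbf{C}\cdot(\mathbf{A} \times \mathbf{B})\\
&= -\mathbf{B}\cdot(\mathbf{A} \times \mathbf{C}) = -\mathbf{A}\cdot(\mathbf{C} \times \mathbf{B}) = -\mathbf{C}\cdot(\mathbf{B} \times \mathbf{A}).
\end{aligned} \tag{6-4}$$
These results as to magnitude are evident from the volume interpretation,
though further discussion is needed to establish the algebraic sign in this
way. Because of (6-4) it is customary to write
$$\mathbf{A}\cdot(\mathbf{B} \times \mathbf{C}) = \mathbf{A}\cdot\mathbf{B} \times \mathbf{C} = (\mathbf{ABC}). \tag{6-5}$$
To evaluate the vector triple product A × (B × C), let i be a unit vector
parallel to B and j a unit vector perpendicular to i in the plane of B and C.
Thus
$$\mathbf{B} = B_x\mathbf{i}, \qquad \mathbf{C} = C_x\mathbf{i} + C_y\mathbf{j}, \qquad \mathbf{A} = A_x\mathbf{i} + A_y\mathbf{j} + A_z\mathbf{k}, \tag{6-6}$$
where k is a unit vector perpendicular to i and j, so oriented that the three
form a right-handed system. Since $\mathbf{B} \times \mathbf{C} = B_xC_y\mathbf{k}$ by (6-6) and (5-6),
we have
$$\begin{aligned}
\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) &= -A_xB_xC_y\,\mathbf{j} + A_yB_xC_y\,\mathbf{i}\\
&= (A_xC_x + A_yC_y)B_x\mathbf{i} - A_xB_x(C_x\mathbf{i} + C_y\mathbf{j})\\
&= \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B}).
\end{aligned} \tag{6-7}$$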
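Example. Establish the identity
$$(\mathbf{A} \times \mathbf{B})\cdot(\mathbf{C} \times \mathbf{D}) =
\begin{vmatrix} \mathbf{A}\cdot\mathbf{C} & \mathbf{B}\cdot\mathbf{C}\\ \mathbf{A}\cdot\mathbf{D} & \mathbf{B}\cdot\mathbf{D} \end{vmatrix}. \tag{6-8}$$
The expression is the scalar triple product of A × B, C, and D. Interchanging the dot
and cross, as we may by (6-4), we obtain
$$\begin{aligned}
(\mathbf{A} \times \mathbf{B})\cdot\mathbf{C} \times \mathbf{D} &= (\mathbf{A} \times \mathbf{B}) \times \mathbf{C}\cdot\mathbf{D} = [(\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{B}\cdot\mathbf{C})\mathbf{A}]\cdot\mathbf{D}\\
&= (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{B}\cdot\mathbf{C})(\mathbf{A}\cdot\mathbf{D}),
\end{aligned} \tag{6-9}$$
since (A × B) × C = -C × (A × B) = (A·C)B - (B·C)A by (6-7).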
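The identities (6-7) and (6-8) lend themselves to a numerical spot check. The sketch below assumes component tuples with integer entries, so that the comparison is exact; the sample vectors and helper names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    (x, y, z), (x1, y1, z1) = a, b
    return (y * z1 - z * y1, z * x1 - x * z1, x * y1 - y * x1)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(t, a):
    return tuple(t * x for x in a)

A, B, C, D = (1, 2, 3), (0, 1, 4), (5, 6, 0), (2, -1, 1)   # arbitrary samples

# (6-7): A x (B x C) = B(A.C) - C(A.B)
lhs_67 = cross(A, cross(B, C))
rhs_67 = sub(scale(dot(A, C), B), scale(dot(A, B), C))

# (6-8): (A x B).(C x D) = (A.C)(B.D) - (B.C)(A.D)
lhs_68 = dot(cross(A, B), cross(C, D))
rhs_68 = dot(A, C) * dot(B, D) - dot(B, C) * dot(A, D)
```

Agreement for one set of vectors does not establish the identities, but a disagreement would reveal an error in a hand calculation.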
PROBLEMS
1. Verify (6-2), (6-7), and (6-8) by direct calculation for the special case A = i + j,
B = -i + 2k, C = j + 2k, D = i + j + k.
2. (a) In Prob. 1 find the volume of the parallelepiped having A, B, and C as adjacent
edges. (b) Find x such that the vectors 2i + j - 2k, i + j + 3k, and xi + j are coplanar.
Hint: A certain parallelepiped must have zero volume. (c) State a simple necessary
and sufficient condition that three arbitrary vectors A, B, C be coplanar. (d) Evaluate
(AAB) and (ABA), where A, B are arbitrary.
3. By (6-7) show that A × (B × C) + B × (C × A) + C × (A × B) = 0.
4. Show that (B × C) × (C × A) = C(ABC), and deduce
(A × B)·(B × C) × (C × A) = (ABC)².
5. The vectors A, B, C issue from a common point and have their heads in a plane.
Show that (A × B) + (B × C) + (C × A) is perpendicular to this plane.
7. Differentiation. If for each value of a scalar t a vector R(t) is defined,
we say that R is a vector function of t. In a particular problem t may
denote the time and R the position vector of a moving point relative to
some origin O. As in the calculus of scalars, we say that R(t) is a continuous
vector function of t at $t = t_0$ provided that
$$\lim_{t \to t_0} \mathbf{R}(t) = \mathbf{R}(t_0). \tag{7-1}$$
The precise meaning of (7-1) is that $|\mathbf{R}(t) - \mathbf{R}(t_0)|$ becomes as small as
desired whenever t is sufficiently near $t_0$.
The cartesian components of the vector R(t) are functions of t, so that
one may write
$$\mathbf{R}(t) = \mathbf{i}x(t) + \mathbf{j}y(t) + \mathbf{k}z(t). \tag{7-2}$$
It follows from (7-1) that the functions x(t), y(t), z(t) are continuous if,
and only if, R(t) is continuous.
We define the derivative of R(t) with respect to t by the formula
$$\frac{d\mathbf{R}}{dt} = \lim_{\Delta t \to 0} \frac{\mathbf{R}(t + \Delta t) - \mathbf{R}(t)}{\Delta t}. \tag{7-3}$$
The substitution of (7-2) in the definition (7-3) leads immediately to the
result that R is differentiable if, and only if, x, y, z are, and in that case
$$\frac{d\mathbf{R}}{dt} = \mathbf{i}\frac{dx}{dt} + \mathbf{j}\frac{dy}{dt} + \mathbf{k}\frac{dz}{dt}. \tag{7-4}$$
As in scalar calculus we shall write R'(t) for dR/dt, R''(t) for d²R/dt², and
so on.
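Products involving vectors are differentiated by the familiar rules of
elementary calculus, and the proof of these rules also involves only familiar
ideas. For example, the formula
$$\frac{d}{dt}(\mathbf{A} \times \mathbf{B}) = \mathbf{A} \times \frac{d\mathbf{B}}{dt} + \frac{d\mathbf{A}}{dt} \times \mathbf{B} \tag{7-5}$$
follows from
$$\begin{aligned}
\Delta(\mathbf{A} \times \mathbf{B}) &= (\mathbf{A} + \Delta\mathbf{A}) \times (\mathbf{B} + \Delta\mathbf{B}) - \mathbf{A} \times \mathbf{B}\\
&= \mathbf{A} \times \Delta\mathbf{B} + \Delta\mathbf{A} \times \mathbf{B} + \Delta\mathbf{A} \times \Delta\mathbf{B}
\end{aligned}$$
when we divide by Δt and let Δt → 0. Of course, the order of the factors
in (7-5) must be preserved, since the cross product is not commutative.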
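Formula (7-5) can be compared with a difference quotient. The following sketch uses two hypothetical vector functions with known derivatives and a central difference with a small step h; the functions, the step, and the tolerance are assumptions chosen for illustration.

```python
def cross(a, b):
    (x, y, z), (x1, y1, z1) = a, b
    return (y * z1 - z * y1, z * x1 - x * z1, x * y1 - y * x1)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

# Hypothetical vector functions and their derivatives.
def A(t):  return (t, t * t, 1.0)
def dA(t): return (1.0, 2 * t, 0.0)
def B(t):  return (1.0, t, t ** 3)
def dB(t): return (0.0, 1.0, 3 * t * t)

def product_rule(t):
    """Right-hand side of (7-5), with the order of the factors preserved."""
    return add(cross(A(t), dB(t)), cross(dA(t), B(t)))

def difference_quotient(t, h=1e-5):
    """Central difference approximation to d(A x B)/dt."""
    ahead, behind = cross(A(t + h), B(t + h)), cross(A(t - h), B(t - h))
    return tuple((p - q) / (2 * h) for p, q in zip(ahead, behind))

exact = product_rule(0.7)
approx = difference_quotient(0.7)
```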
A geometric interpretation of the derivative may be obtained as follows:
Let the vector R(t) be regarded as a bound vector with its origin at the
origin of coordinates. The head of R then traces out a space curve as t
varies (see Fig. 16). The vector
$$\Delta\mathbf{R} = \mathbf{R}(t + \Delta t) - \mathbf{R}(t) \tag{7-6}$$
is directed along a secant of the curve, ΔR/Δt is parallel to this secant,
and hence lim (ΔR/Δt) is tangent. Thus, the vector R'(t) is tangent to the
space curve R = R(t) whenever R'(t) exists and R'(t) ≠ 0.
To interpret the magnitude |R'(t)|, let s be the length of the curve
from the fixed point given by $t = t_0$ to the variable point given by t.
Assuming R'(t) ≠ 0, we have ΔR ≠ 0 for small Δt > 0, and hence
$$\frac{\Delta s}{\Delta t} = \frac{\Delta s}{|\Delta\mathbf{R}|}\,\frac{|\Delta\mathbf{R}|}{\Delta t} = \frac{\Delta s}{|\Delta\mathbf{R}|}\left|\frac{\Delta\mathbf{R}}{\Delta t}\right|. \tag{7-7}$$
Since |ΔR| is the length of the chord, and since the ratio Δs/|ΔR| of
arc to chord¹ tends to 1, Eq. (7-7) gives
$$\frac{ds}{dt} = \left|\frac{d\mathbf{R}}{dt}\right| \tag{7-8}$$
when Δt → 0. Thus, the vector R'(t) has magnitude |R'| = ds/dt where s
is the arc length along the curve. If R'(t) is continuous, the arc is given
explicitly by
$$s = \int_{t_0}^{t} |\mathbf{R}'(t)|\,dt = \int_{t_0}^{t} \sqrt{(x')^2 + (y')^2 + (z')^2}\,dt. \tag{7-9}$$
Introduction of s as parameter instead of t facilitates the study of space
curves (see Sec. 10).
¹ We assume that s increases with t; otherwise a minus sign is needed. The fact that
(arc)/(chord) → 1 follows from the familiar interpretation of arc as limit of lengths of
inscribed polygons. It is also possible to take (arc)/(chord) → 1 as one of the defining
properties of arc and proceed, as in the text, to obtain the formula (7-9).
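The arc-length formula (7-9) and the interpretation of arc as a limit of inscribed polygons can be compared numerically. The sketch below uses a hypothetical circular helix for which |R'(t)| = 5, so that the length from t = 0 to t = 2 is 10; the midpoint rule and the number of subdivisions are illustrative assumptions.

```python
import math

def position(t):
    """Hypothetical curve R(t) = i 3cos t + j 3sin t + k 4t (a circular helix)."""
    return (3 * math.cos(t), 3 * math.sin(t), 4 * t)

def derivative(t):
    """R'(t) for the helix above, obtained by differentiating each component."""
    return (-3 * math.sin(t), 3 * math.cos(t), 4.0)

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def arc_length_integral(t0, t1, n=10000):
    """Midpoint-rule approximation to the integral of |R'(t)| from t0 to t1."""
    h = (t1 - t0) / n
    return sum(norm(derivative(t0 + (k + 0.5) * h)) for k in range(n)) * h

def arc_length_chords(t0, t1, n=10000):
    """Length of the inscribed polygon with n equal steps in the parameter."""
    h = (t1 - t0) / n
    points = [position(t0 + k * h) for k in range(n + 1)]
    return sum(norm(tuple(q - p for p, q in zip(a, b)))
               for a, b in zip(points, points[1:]))

s_integral = arc_length_integral(0.0, 2.0)
s_chords = arc_length_chords(0.0, 2.0)
```

The polygon length falls slightly short of the integral, as one expects, since each chord is shorter than the arc it subtends.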
In two dimensions the interpretation of R'(t) given here agrees with the results of
elementary calculus. Let a smooth curve C be represented parametrically by x = x(t),
y = y(t), so that the slope is given by
$$\text{Slope} = \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'}{x'} \tag{7-10}$$
for x' ≠ 0. If the same curve is described in the form R = ix + jy, we have R' = ix' +
jy', and hence the slope of the vector R' is y'/x'. In view of (7-10), the fact that R' is
tangent to the curve agrees with the fact that dy/dx is the slope of the curve. The formula
ds/dt = |R'| is also familiar; it states that
$$\frac{ds}{dt} = \sqrt{(x')^2 + (y')^2} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2},$$
which becomes ds² = dx² + dy² when squared and multiplied by (dt)².
Physically, one may regard t as time, so that the head of the bound
vector R(t) gives the position of a moving particle at time t. Since the
velocity is defined to be V = R'(t), the foregoing result means that the
velocity vector is tangent to the trajectory and has magnitude equal to the speed
ds/dt with which the particle is moving.
Example 1. The position of a particle at time t is determined by the bound vector
$$\mathbf{R}(t) = \mathbf{i}t + \mathbf{j}t^3 + \mathbf{k}\sin t.$$
Find a vector tangent to the orbit at time t, and find the speed of the particle at time
t = 0.
We have R'(t) = i + 3jt² + k cos t, which is the required tangent vector. At t = 0
the velocity is R'(0) = i + k, and hence the speed is ds/dt = |R'(0)| = √2.
Example 2. If a differentiable vector R(t) has constant length, show that R' is
perpendicular to R, and interpret geometrically.
From R·R = const, differentiation yields R·R' + R'·R = 0, whence R'·R = 0.
Geometrically, if R is a bound vector of constant length, its head traces out a curve lying
on a sphere. The tangent to the curve is tangent to the sphere, hence perpendicular
to the radius vector. Thus, R' ⊥ R.
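Both examples can be illustrated by direct computation. In the sketch below the derivative of Example 1 is written out by hand and compared with a central difference; the unit-length curve `U` used for Example 2 is a hypothetical choice, as are the helper names and tolerances.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Example 1: R(t) = i t + j t^3 + k sin t and its derivative.
def R(t):       return (t, t ** 3, math.sin(t))
def R_prime(t): return (1.0, 3 * t * t, math.cos(t))

def central_difference(f, t, h=1e-6):
    return tuple((p - q) / (2 * h) for p, q in zip(f(t + h), f(t - h)))

speed_at_0 = norm(R_prime(0.0))          # |i + k|

# Example 2: a hypothetical curve of constant length 1 on the unit sphere.
def U(t):
    return (math.cos(t), math.sin(t) * math.cos(2 * t), math.sin(t) * math.sin(2 * t))

def U_prime(t):
    return (-math.sin(t),
            math.cos(t) * math.cos(2 * t) - 2 * math.sin(t) * math.sin(2 * t),
            math.cos(t) * math.sin(2 * t) + 2 * math.sin(t) * math.cos(2 * t))

perpendicularity = dot(U(0.9), U_prime(0.9))   # R . R', expected to vanish
```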
PROBLEMS
1. If R(t) = i2t + j3t² + kt³, (a) find the derivative R'(t). (b) At the point (2,3,1)
find a tangent to the space curve which is traced out by the head of R when R is regarded
as a bound vector. Hint: The point (2,3,1) corresponds to t = 1. (c) If R(t) is a bound
vector giving the position of a moving particle at time t, find the velocity and speed of
this particle at time t = 1.
2. (a) Differentiate the vector R(t) = it + j sin t + k cos t, compute |R'(t)|, and
simplify. (b) If R(t) is a bound vector, find the length of the curve traced out by the head
of R as t varies from t = 0 to t = 2.
3. By writing A·B in component form and differentiating, deduce (A·B)' = A'·B +
A·B'.
4. If $\mathbf{R}_0$ and A are constant, find a vector tangent to the curve described by the bound
vector $\mathbf{R} = \mathbf{R}_0 + \mathbf{A}t$.
5. If R(t) is a bound vector giving the position of a moving particle at time t, the
acceleration is defined to be A = R''(t). Show that A is constant if $\mathbf{R}(t) = \mathbf{R}_0 + \mathbf{R}_1t + \mathbf{R}_2t^2$,
where $\mathbf{R}_0$, $\mathbf{R}_1$, and $\mathbf{R}_2$ are constant vectors. Is the converse true?
6. Show that (ABC)' = (A'BC) + (AB'C) + (ABC'), when A, B, C are differentiable,
and write out in determinant form.
7. If R = A + f(t)B, where A and B are constant and f is twice differentiable, then
R' × R'' = 0.
APPLICATIONS
8. Mechanics and Dynamics. The work W done by a constant force F
producing a displacement S in the direction of F is |F| |S|. More generally,
if F makes an angle θ with S, the work is |F| |S| cos θ, and hence
$$W = \mathbf{F}\cdot\mathbf{S}. \tag{8-1}$$
Because of this equation the dot product plays a central role in certain
branches of mechanics.
To illustrate the application of cross products, let the vector Ω represent
the angular velocity of a rotating body; that is, let Ω be a
vector whose magnitude is the angular speed in radians per second and
whose direction is parallel to the axis of rotation. The positive sense of
Ω is chosen as that in which a right-handed screw would advance if the
screw were rotated in the same direction as the body. Let R be a vector
locating any point P of the body relative to some point O on the axis
of rotation. It is required to find the instantaneous velocity V of the
point P. If the distance of P from the axis of rotation is a, then by Fig. 17
$$|\mathbf{V}| = |\boldsymbol{\Omega}|\,a = |\boldsymbol{\Omega}|\,|\mathbf{R}| \sin(\mathbf{R},\boldsymbol{\Omega}).$$
Moreover, V is normal to the plane of R and Ω and is so directed that
Ω, R, and V form a right-handed system. Hence,
$$\mathbf{V} = \boldsymbol{\Omega} \times \mathbf{R}. \tag{8-2}$$
The result is independent of the origin O, for if a new origin $O_1$ is chosen and P is
specified by a vector $\mathbf{R}_1$ from $O_1$, then
$$\mathbf{R}_1 = \mathbf{R} + \mathbf{S},$$
where S is parallel to Ω (see Fig. 17). Hence Ω × S = 0, and therefore
$$\boldsymbol{\Omega} \times \mathbf{R}_1 = \boldsymbol{\Omega} \times (\mathbf{R} + \mathbf{S}) = \boldsymbol{\Omega} \times \mathbf{R} + \boldsymbol{\Omega} \times \mathbf{S} = \boldsymbol{\Omega} \times \mathbf{R}.$$
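The velocity formula V = Ω × R and its independence of the choice of origin on the axis can be illustrated as follows. The numerical values, the name `OMEGA`, and the helper functions are hypothetical; the axis of rotation is taken along the z axis for convenience.

```python
def cross(a, b):
    (x, y, z), (x1, y1, z1) = a, b
    return (y * z1 - z * y1, z * x1 - x * z1, x * y1 - y * x1)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

OMEGA = (0, 0, 2)    # angular velocity: 2 rad/sec about the z axis
R = (3, 4, 5)        # position of P relative to a point O on the axis
S = (0, 0, -6)       # shift to another origin on the axis (parallel to OMEGA)

V = cross(OMEGA, R)                     # velocity of P
V_new_origin = cross(OMEGA, add(R, S))  # same point, referred to the new origin

distance_from_axis_squared = R[0] ** 2 + R[1] ** 2   # a^2 for an axis along z
```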
Another example from dynamics illustrates the compactness of vector
notation. Let O be a fixed point in a rigid body, and let a force F be applied
at a point R of the body, which is located by the bound vector R whose
origin is at O. The force F establishes a torque or moment T which tends
to rotate the body about an axis that passes through O and is normal to
the plane of R and F. The magnitude of T is given by
$$|\mathbf{T}| = |\mathbf{R}|\,|\mathbf{F}| \sin(\mathbf{R},\mathbf{F}).$$
In addition, R, F, and T form a right-handed system, so that
$$\mathbf{T} = \mathbf{R} \times \mathbf{F}. \tag{8-3}$$
That the choice of O is immaterial follows as in the discussion of (8-2).
Similarly one shows that F may slide along its line of action without
affecting the result; that is, F may be regarded as a sliding vector.
To illustrate the use of (8-3), we obtain a formula for the so-called center of mass of a
system of mass points. The force on a point of mass m in a gravitational field is given
by mF, where m is the mass of the point and F is a vector specifying the strength of the
field at the point in question. We assume a uniform field, so that F is independent of
position. From (8-3)
$$(\mathbf{R} - \mathbf{P}) \times m\mathbf{F} \tag{8-4}$$
represents the moment about the point¹ P of the gravitational force on a point of mass
m at R. If there are n points of masses $m_1, m_2, \ldots, m_n$ located by the vectors $\mathbf{R}_1, \mathbf{R}_2,
\ldots, \mathbf{R}_n$, respectively, the total moment about the point P due to all of them is
$$\sum (\mathbf{R}_i - \mathbf{P}) \times m_i\mathbf{F}. \tag{8-5}$$
It is desired to find a single mass point such that its moment (8-4) reproduces the
total moment (8-5) for all choices of F and P. Equating the moments (8-4) and (8-5)
leads to
$$\left[m\mathbf{P} - \sum m_i\mathbf{P} - m\mathbf{R} + \sum m_i\mathbf{R}_i\right] \times \mathbf{F} = \mathbf{0}, \tag{8-6}$$
after rearrangement. Since F is arbitrary in (8-6), the factor in brackets must vanish,
so that
$$\mathbf{P}\left(m - \sum m_i\right) = m\mathbf{R} - \sum m_i\mathbf{R}_i. \tag{8-7}$$
The fact that P is arbitrary in (8-7) now gives
$$m = \sum m_i, \qquad m\mathbf{R} = \sum m_i\mathbf{R}_i. \tag{8-8}$$
Conversely, (8-8) ensures the validity of (8-7) and hence of (8-6) independently of F and P.
This discussion was carried out by equating moments only. Equation (8-8) shows,
however, that the total gravitational force is also preserved, since the mass of the point
equals the total mass of the collection.
¹ The vectors R, P, and $\mathbf{R}_i$ are bound position vectors with a common origin for the
points R, P, and $R_i$, respectively.
The point R with position vector
$$\mathbf{R} = \frac{m_1\mathbf{R}_1 + m_2\mathbf{R}_2 + \cdots + m_n\mathbf{R}_n}{m_1 + m_2 + \cdots + m_n} \tag{8-9}$$
determined by (8-8) is called the center of mass. Evidently the collection
of points, regarded as a rigid body, would balance about the point R as
pivot, for the moment (8-4) is zero when P = R, and hence the moment
(8-5) also vanishes.
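Formula (8-9) and the balancing property just mentioned can be checked on a small system of mass points. The masses, positions, and field vectors below are hypothetical sample values, and exact rational arithmetic is used so that a vanishing moment is exactly zero.

```python
from fractions import Fraction

def cross(a, b):
    (x, y, z), (x1, y1, z1) = a, b
    return (y * z1 - z * y1, z * x1 - x * z1, x * y1 - y * x1)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(t, a):
    return tuple(t * x for x in a)

masses = [2, 3, 5]
positions = [(1, 0, 0), (0, 2, 0), (0, 0, 4)]

def center_of_mass(masses, positions):
    """Position vector given by (8-9)."""
    total = sum(masses)
    weighted = (0, 0, 0)
    for m_i, r_i in zip(masses, positions):
        weighted = add(weighted, scale(m_i, r_i))
    return scale(Fraction(1, total), weighted)

def total_moment(P, F):
    """Total moment (8-5) about the point P in a uniform field F."""
    moment = (0, 0, 0)
    for m_i, r_i in zip(masses, positions):
        moment = add(moment, cross(sub(r_i, P), scale(m_i, F)))
    return moment

R_cm = center_of_mass(masses, positions)
m = sum(masses)
```

For any other point P the total moment agrees with the moment (8-4) of a single point of mass m placed at the center of mass.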
Still another example of the use of vectors in mechanics is given by
Newton's laws. Relative to an origin O, which is regarded as fixed, let
the position of a particle at time t be specified by the bound vector R(t).
The velocity vector V is dR/dt, as indicated in Sec. 7, and the momentum
vector is defined by
$$\mathbf{M} = m\mathbf{V} = m\frac{d\mathbf{R}}{dt}, \tag{8-10}$$
where m is the mass of the particle at time t. In this notation Newton's
second law of motion takes the simple form
$$\mathbf{F} = \frac{d\mathbf{M}}{dt}, \tag{8-11}$$
where F is the force on the particle at time t. If m is constant the result is
$$\mathbf{F} = m\frac{d^2\mathbf{R}}{dt^2}. \tag{8-12}$$
We shall use (8-10) and (8-12) to derive some interesting properties of the center of
mass. Suppose given n particles with masses $m_i$ and positions denoted by $\mathbf{R}_i$ (i = 1,
2, . . ., n), where each $m_i$ is independent of t. The total momentum of the system
satisfies
$$\sum m_i\frac{d\mathbf{R}_i}{dt} = \frac{d}{dt}\sum m_i\mathbf{R}_i = m\frac{d}{dt}\,\frac{\sum m_i\mathbf{R}_i}{m} = m\frac{d\mathbf{R}}{dt}, \tag{8-13}$$
where $m = \sum m_i$ is the total mass and where R locates the center of mass [(8-9)]. Thus,
the total momentum of the system equals that of a single particle which has mass m and moves
with the same velocity as the center of mass of the system.
If (8-13) is differentiated with respect to t, there results
$$\sum \mathbf{F}_i = m\frac{d^2\mathbf{R}}{dt^2}$$
when we let $\mathbf{F}_i$ be the force on the ith particle and use (8-12). Since internal forces
cancel in pairs by Newton's law of equal and opposite reaction, the sum $\sum \mathbf{F}_i$ represents the
total external force acting on the system. Hence the center of mass has the same
acceleration as a particle of mass m acted on by a force equal to the sum of the external forces acting
on the system.
Example 1. Parallel forces F, -F of equal magnitude but opposite direction constitute
a couple. Find the total moment, and show that it is the same about every point.
Let R be a vector from a given point O to a point P on the line of action of F, and $\mathbf{R}_1$
to a point $P_1$ on the line of action of -F (Fig. 18). The total torque is
$$\mathbf{R} \times \mathbf{F} + \mathbf{R}_1 \times (-\mathbf{F}) = (\mathbf{R} - \mathbf{R}_1) \times \mathbf{F} = \overrightarrow{P_1P} \times \mathbf{F}.$$
Since this is independent of O, the result
follows. Notice that F and -F must be
regarded as sliding vectors (Sec. 1) rather
than free vectors, since the line of action
is fixed.
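The independence of the moment of a couple from the reference point can be seen in a short computation. The force, the points of application, and the trial origins below are hypothetical sample values.

```python
def cross(a, b):
    (x, y, z), (x1, y1, z1) = a, b
    return (y * z1 - z * y1, z * x1 - x * z1, x * y1 - y * x1)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(t, a):
    return tuple(t * x for x in a)

F = (0, 3, 0)       # force applied at P; the force -F is applied at P1
P = (1, 0, 0)
P1 = (-2, 1, 0)

def couple_moment(origin):
    """Total torque R x F + R1 x (-F) about the point `origin`."""
    R, R1 = sub(P, origin), sub(P1, origin)
    return add(cross(R, F), cross(R1, scale(-1, F)))

moments = [couple_moment(origin) for origin in [(0, 0, 0), (5, -7, 2), (1, 1, 1)]]
expected = cross(sub(P, P1), F)     # (vector from P1 to P) x F
```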
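Example 2. A system of forces $\mathbf{F}_i$ acting
at various points $\mathbf{R}_i$ of a rigid body is such
that $\sum \mathbf{F}_i = \mathbf{0}$. If the total torque about
one point is zero, then the total torque
about every point is zero.
From $\sum (\mathbf{R}_0 - \mathbf{R}_i) \times \mathbf{F}_i = \mathbf{0}$, say, we are
to deduce $\sum (\mathbf{R} - \mathbf{R}_i) \times \mathbf{F}_i = \mathbf{0}$. The two
equations may be written
$$\mathbf{R}_0 \times \left(\sum \mathbf{F}_i\right) = \sum \mathbf{R}_i \times \mathbf{F}_i, \tag{8-14}$$
$$\mathbf{R} \times \left(\sum \mathbf{F}_i\right) = \sum \mathbf{R}_i \times \mathbf{F}_i. \tag{8-15}$$
Equation (8-14) gives $\sum \mathbf{R}_i \times \mathbf{F}_i = \mathbf{0}$, since $\sum \mathbf{F}_i = \mathbf{0}$, and (8-15) follows.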
Example 3. The moment of the momentum vector M about a point is called the
angular momentum of the particle about that point. According to the principle of
angular momentum, the rate of increase of angular momentum about a point equals the
resultant torque about that point. Show that this principle is equivalent to Newton's
law, F = dM/dt.
If A is the angular momentum about the origin, then A = R × M, where R gives the
position of the point. Thus
$$\frac{d\mathbf{A}}{dt} = \mathbf{R} \times \frac{d\mathbf{M}}{dt} + \frac{d\mathbf{R}}{dt} \times \mathbf{M} = \mathbf{R} \times \frac{d\mathbf{M}}{dt} + \mathbf{V} \times (m\mathbf{V}) = \mathbf{R} \times \frac{d\mathbf{M}}{dt}.$$
The principle of angular momentum dA/dt = R × F is therefore equivalent to
$$\mathbf{R} \times \frac{d\mathbf{M}}{dt} = \mathbf{R} \times \mathbf{F}. \tag{8-16}$$
If this holds for every choice of origin, that is, for every R, then necessarily dM/dt = F.
Conversely, if dM/dt = F, then (8-16) holds for every R.
PROBLEMS
1. Given $\mathbf{A} = \mathbf{i} + 2\mathbf{j} + \mathbf{k}$, $\mathbf{B} = \mathbf{i} - \mathbf{k}$, $\mathbf{C} = 2\mathbf{i} + \mathbf{j}$, with $\mathbf{A}$, $\mathbf{B}$ having their origins at a common point, (a) find the work done by a force $\mathbf{A}$ in a displacement $\mathbf{B}$. (b) Find the work done in a displacement from the head of $\mathbf{A}$ to the head of $\mathbf{B}$ under a force $\mathbf{C}$. (c) Find the work done in the displacement $\mathbf{A}$ subject to simultaneous forces $\mathbf{B}$ and $\mathbf{C}$.
2. In Prob. 1: (a) Find the torque about the origin of $\mathbf{A}$ due to a force $\mathbf{C}$ through the head of $\mathbf{A}$. (b) Find the torque about the head of $\mathbf{A}$ due to a force $\mathbf{C}$ acting through the head of $\mathbf{B}$.
3. In Prob. 1: (a) If the figure formed by $\mathbf{A}$ and $\mathbf{B}$ rotates about $\mathbf{A}$ with angular velocity $\omega$, find the velocity of the head of $\mathbf{B}$. (b) Find the velocity of the head of $\mathbf{A}$ if the figure formed by $\mathbf{A}$ and $\mathbf{B}$ rotates with angular velocity $\omega$ about an axis parallel to $\mathbf{C}$ through the head of $\mathbf{B}$.
4. Two coordinate systems have a common origin at all times, but the second has a vectorial angular velocity $\boldsymbol{\omega}$ relative to the first. Show that $\mathbf{V}_1 = \mathbf{V}_2 + (\boldsymbol{\omega} \times \mathbf{R})$, where $\mathbf{V}_1$ and $\mathbf{V}_2$ are the velocity vectors in the first and second systems of a point whose position vector is $\mathbf{R}$ in the first system.
5. Show that the torque due to two couples is the sum of the torques.
6. Three points labeled 1, 2, 3 have masses 1, 2, 3 and positions 2i + j 2k, i - k, 3j,
respectively, (a) Find the center of mass, (b) Find the total mass 2, 1, and their cen-
ter of mass. From this obtain, again, the center of mass for all three.
7. The vectors $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$, $\mathbf{E}$ give the positions of the vertices of a regular pentagon as referred to an origin not necessarily in its plane. Show that their resultant is equal to $5\mathbf{R}$, where $\mathbf{R}$ gives the position of the center. Hint: Place a unit mass at each vertex, and find the center of mass in two ways.
8. (a) Show that $\mathbf{F} \cdot \mathbf{V}$ represents the rate at which work is done on a particle moving with velocity $\mathbf{V}$ under a force $\mathbf{F}$. (b) When the mass is constant, show that
$$\frac{d}{dt}\left(\frac{m|\mathbf{V}|^2}{2}\right) = \mathbf{F} \cdot \mathbf{V},$$
so that the rate of increase of kinetic energy equals the rate at which work is done on the particle.
9. Lines and Planes. If R is a bound vector with its origin at the origin
of coordinates, then the direction numbers x, y, z are the same as the
coordinates of the head of R, and one may speak indifferently of “the
point R” or “the vector R.” This correspondence between vectors and
points enables us to use vectors in geometry. Here we consider the ge-
ometry of lines and planes, which is especially simple; the following sections
are concerned with general curves and surfaces.
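Suppose we have given a plane through the point $\mathbf{R}_0$ and perpendicular to the constant vector $\mathbf{A}$. If the point $\mathbf{R}$ is in the plane, then $\mathbf{R} - \mathbf{R}_0$ is perpendicular to $\mathbf{A}$, and conversely (Fig. 19). Hence the equation of the plane is
$$(\mathbf{R} - \mathbf{R}_0) \cdot \mathbf{A} = 0. \tag{9-1}$$
If $D$ is the distance from the point $\mathbf{R}_1$ to the plane, then
$$D = |\mathbf{R}_1 - \mathbf{R}_0|\,|\cos\theta| = \frac{|\mathbf{R}_1 - \mathbf{R}_0|\,|\mathbf{A}|\,|\cos\theta|}{|\mathbf{A}|} = \frac{|(\mathbf{R}_1 - \mathbf{R}_0) \cdot \mathbf{A}|}{|\mathbf{A}|}, \tag{9-2}$$
where $\theta$ is the angle between $\mathbf{A}$ and $\mathbf{R}_1 - \mathbf{R}_0$.
Next, consider a line through the point $\mathbf{R}_0$ and parallel to a constant vector $\mathbf{A}$. If the point $\mathbf{R}$ is on this line, then the vector $\mathbf{R} - \mathbf{R}_0$ is parallel to $\mathbf{A}$, and conversely (Fig. 20). Hence, the equation of the line is
$$(\mathbf{R} - \mathbf{R}_0) \times \mathbf{A} = \mathbf{0}. \tag{9-3}$$
If $D$ is the perpendicular distance from the point $\mathbf{R}_1$ to this line, then
$$D = |\mathbf{R}_1 - \mathbf{R}_0|\,|\sin\theta| = \frac{|(\mathbf{R}_1 - \mathbf{R}_0) \times \mathbf{A}|}{|\mathbf{A}|}, \tag{9-4}$$
where $\theta$ is again the angle between $\mathbf{A}$ and $\mathbf{R}_1 - \mathbf{R}_0$.
Fig. 19    Fig. 20
In (9-3) the fact that $\mathbf{R} - \mathbf{R}_0$ is parallel to $\mathbf{A}$ may also be expressed by writing
$$\mathbf{R} - \mathbf{R}_0 = \mathbf{A}t,$$
where $t$ is a scalar. Thus we obtain the equation of the straight line in a parametric form,
$$\mathbf{R} = \mathbf{R}_0 + \mathbf{A}t, \qquad -\infty < t < \infty, \tag{9-5}$$
which is often more useful than (9-3). It is left to the student to deduce the cartesian equation by setting
$$\mathbf{R}_0 = x_0\mathbf{i} + y_0\mathbf{j} + z_0\mathbf{k}, \qquad \mathbf{A} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$$
in (9-5) and equating components. Eliminating $t$ yields the symmetric form
$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}, \tag{9-6}$$
which may also be found from (9-3).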
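The two distance formulas of this section, point to plane and point to line, translate directly into code. The following is a minimal sketch in plain Python; the function names and the sample plane, line and points are illustrative assumptions chosen so that the answers are obvious by inspection.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def distance_to_plane(r1, r0, a):
    """Distance from r1 to the plane through r0 with normal a: |(r1 - r0) . a| / |a|."""
    return abs(dot(sub(r1, r0), a)) / norm(a)


def distance_to_line(r1, r0, a):
    """Distance from r1 to the line through r0 parallel to a: |(r1 - r0) x a| / |a|."""
    return norm(cross(sub(r1, r0), a)) / norm(a)


# The plane z = 0 (normal need not be a unit vector) and the x axis.
d_plane = distance_to_plane((3, 4, 5), (0, 0, 0), (0, 0, 2))
d_line = distance_to_line((3, 4, 0), (0, 0, 0), (2, 0, 0))
print(d_plane, d_line)
```

The point (3, 4, 5) lies 5 units above the plane z = 0, and the point (3, 4, 0) lies 4 units from the x axis, so the two printed values can be checked without computation.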
Example 1. Show that every equation of form
$$ax + by + cz + d = 0, \qquad a, b, c, d \text{ const.}, \tag{9-7}$$
represents a plane with $\mathbf{A} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$ as normal, and conversely.
If $\mathbf{R}$ is a general point and $\mathbf{R}_0$ a fixed point on the locus (9-7), then writing (9-7) in vector form yields
$$\mathbf{R} \cdot \mathbf{A} + d = 0, \qquad \mathbf{R}_0 \cdot \mathbf{A} + d = 0.$$
Subtracting these equations we obtain (9-1), which shows that the locus is a plane. On the other hand, (9-1) itself has the form (9-7), with $d = -\mathbf{R}_0 \cdot \mathbf{A}$, and hence the converse is also true.
Example 2. Find the equation of a line which passes through the point $\mathbf{i} - \mathbf{j}$ and is parallel to the two planes $x + y = 3$, $2x + y + 3z = 4$.
The respective normals to the planes are $\mathbf{i} + \mathbf{j}$ and $2\mathbf{i} + \mathbf{j} + 3\mathbf{k}$, and hence the line of intersection of the planes is parallel to the cross product:
$$(\mathbf{i} + \mathbf{j}) \times (2\mathbf{i} + \mathbf{j} + 3\mathbf{k}) = 3\mathbf{i} - 3\mathbf{j} - \mathbf{k}.$$
Since this vector is parallel to both planes, it gives the direction of the required line, and hence the equation is
$$\mathbf{R} = \mathbf{i} - \mathbf{j} + (3\mathbf{i} - 3\mathbf{j} - \mathbf{k})t, \qquad -\infty < t < \infty.$$
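The computation in Example 2 can be reproduced in a few lines. This is a minimal sketch in plain Python; the names `cross`, `dot` and `point_on_line` are illustrative and not part of the text.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


n1 = (1, 1, 0)   # normal to x + y = 3
n2 = (2, 1, 3)   # normal to 2x + y + 3z = 4
direction = cross(n1, n2)


def point_on_line(t, r0=(1, -1, 0)):
    """R = R0 + A t, the parametric form (9-5), with A = direction."""
    return tuple(p + d * t for p, d in zip(r0, direction))


print(direction)                                # (3, -3, -1)
print(dot(direction, n1), dot(direction, n2))   # 0 0
print(point_on_line(2))                         # (7, -7, -2)
```

The two vanishing dot products confirm that the direction vector is perpendicular to both normals, that is, parallel to both planes.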
PROBLEMS
1. (a) Find a vector normal to the plane $x + 2y + 3z = 1$. (b) Find the angle between this plane and the plane $x + y + z + 2 = 0$. (c) What is the distance from the point $3\mathbf{i} + 2\mathbf{j} + \mathbf{k}$ to the plane in (a)? (d) Show that the points $\mathbf{i}$ and $-\mathbf{j} + \mathbf{k}$ lie in the plane in (a). (e) Find a vector lying in the plane in (a). Hint: Subtract the vectors of (d). (f) Verify that the vector of (e) is normal to the normal found in (a).
2. (a) Find a vector parallel to the line $\mathbf{R} = \mathbf{i} + \mathbf{k} + (\mathbf{i} + 2\mathbf{j} + 3\mathbf{k})t$. (b) If $\mathbf{R} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$ in (a), find $x$, $y$, and $z$ in terms of $t$. (c) In (a), find the distance from the point $\mathbf{i} + 2\mathbf{j} + 3\mathbf{k}$ to the given line. (d) Show that the line (a) intersects the line $\mathbf{R} = 2\mathbf{k} + (3\mathbf{i} + 2\mathbf{j} + \mathbf{k})s$. Hint: Equate the two expressions for $\mathbf{R}$, and consider each component. It will be found that all three equations are satisfied by $s = t = \tfrac{1}{2}$. (e) Find the intersection point in (d). (f) Find the point where the line in (a) intersects the plane $2x - y + 3z = 4$. Hint: Substitute the result of (b) into the equation of the plane, find $t$, then find $\mathbf{R}$.
3. (a) Find the equation of the line common to the two planes $x + 2y + 4z = 1$, $x + y = 3$ in the form $\mathbf{R} = \mathbf{R}_0 + \mathbf{A}t$. Hint: Let $z = t$, and solve for $x$ and $y$ in terms of $t$. (b) Find a vector parallel to the intersection of the planes by use of the cross product as in Example 2. (c) Verify that your answers to (a) and (b) are consistent. (d) Find the equation of all planes perpendicular to both planes. (e) Write the equation of the line which is parallel to both planes and passes through the point $-3\mathbf{i} + \mathbf{k}$.
4. (a) In terms of $t$, find the square of the distance from the point $\mathbf{i} + 2\mathbf{j} + 3\mathbf{k}$ to a general point on the line $\mathbf{R} = 3\mathbf{i} + 2\mathbf{j} + \mathbf{k} + (\mathbf{i} + \mathbf{j} + \mathbf{k})t$. (b) By differentiating, find the $t$ for which the distance is minimum and the minimum value. (c) Check by the distance formula.
5. In the form (9-5) obtain the equation of a line perpendicular to the plane $x + y + 3z = 0$ at the origin. At what point does this line intersect the plane $y = 3z + 1$?
6. If the lines $\mathbf{R} = \mathbf{R}_0 + \mathbf{A}t$ and $\mathbf{R} = \mathbf{R}_1 + \mathbf{B}t$ are not parallel, then the perpendicular distance between them is
$$\frac{|(\mathbf{R}_1 - \mathbf{R}_0) \cdot \mathbf{A} \times \mathbf{B}|}{|\mathbf{A} \times \mathbf{B}|}.$$
Hint: By a suitable figure show that the distance is the length of the projection of $\mathbf{R}_1 - \mathbf{R}_0$ on the common perpendicular to the two lines.
10. Normal Lines and Tangent Planes. If a curve $C$: $x = x(t)$, $y = y(t)$, $z = z(t)$ lies on a surface which has the equation
$$u(x,y,z) = c, \tag{10-1}$$
where $c$ is constant, then
$$u[x(t),y(t),z(t)] \equiv c \tag{10-2}$$
identically in $t$. At a fixed point $\mathbf{R}_0 = \mathbf{i}x_0 + \mathbf{j}y_0 + \mathbf{k}z_0$ (Fig. 21) we differentiate (10-2) by the chain rule (Chap. 3, Sec. 4) to obtain
$$\frac{\partial u}{\partial x}\frac{dx}{dt} + \frac{\partial u}{\partial y}\frac{dy}{dt} + \frac{\partial u}{\partial z}\frac{dz}{dt} = 0. \tag{10-3}$$
This may be written as
$$\mathbf{n} \cdot \mathbf{R}'(t) = 0, \tag{10-4}$$
where $\mathbf{R}(t) = \mathbf{i}x(t) + \mathbf{j}y(t) + \mathbf{k}z(t)$ and where
$$\mathbf{n} = \mathbf{i}\frac{\partial u}{\partial x} + \mathbf{j}\frac{\partial u}{\partial y} + \mathbf{k}\frac{\partial u}{\partial z} \quad \text{at } (x_0,y_0,z_0). \tag{10-5}$$
Since $\mathbf{R}'(t)$ is tangent to the curve $C$ by Sec. 7, it follows from (10-4) that $\mathbf{n}$ is normal to the curve $C$. And since this is true for every choice of $C$, the vector $\mathbf{n}$ must be normal to the surface. The tangent plane is the plane perpendicular to $\mathbf{n}$ at $\mathbf{R}_0$, and hence its equation is $\mathbf{n} \cdot (\mathbf{R} - \mathbf{R}_0) = 0$ by Sec. 9.
The assumptions which underlie the foregoing result are clear from the derivation. We assume $u$ differentiable (so that the chain rule holds), and we assume that not all the partial derivatives are zero (otherwise $\mathbf{n} = \mathbf{0}$, and $\mathbf{n}$ does not determine a direction). The analysis shows, then, that $\mathbf{n}$ is perpendicular to every differentiable curve $\mathbf{R} = \mathbf{R}(t)$ which passes through the point $\mathbf{R}_0$ and lies in the surface. It is this property that enables us to consider $\mathbf{n}$ as “normal to the surface.”
To illustrate the use of (10-5) we find a normal vector and tangent plane for the ellipsoid
$$x^2 + 2y^2 + 3z^2 = 12$$
at the point $(1,2,-1)$. Since $u = x^2 + 2y^2 + 3z^2$, the partial derivatives are $2x$, $4y$, and $6z$. Evaluating these at $(1,2,-1)$ and substituting in (10-5) give the normal vector
$$\mathbf{n} = 2\mathbf{i} + 8\mathbf{j} - 6\mathbf{k}.$$
The tangent plane is perpendicular to $\mathbf{n}$ and contains the point $(1,2,-1)$. Hence its equation is
$$x + 4y - 3z = 12,$$
as the reader can verify.
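The ellipsoid calculation can be verified numerically. The sketch below, in plain Python, approximates the partial derivatives in (10-5) by central differences; the function name `gradient` and the step size `h` are assumptions made for illustration, and an exact symbolic derivative would serve equally well.

```python
def gradient(u, point, h=1e-5):
    """Approximate (du/dx, du/dy, du/dz) at `point` by central differences."""
    components = []
    for i in range(3):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        components.append((u(*forward) - u(*backward)) / (2 * h))
    return tuple(components)


def u(x, y, z):
    return x ** 2 + 2 * y ** 2 + 3 * z ** 2


r0 = (1.0, 2.0, -1.0)
n = gradient(u, r0)                                   # close to (2, 8, -6)
plane_constant = sum(ni * ri for ni, ri in zip(n, r0))  # n . R0, close to 24
print(n, plane_constant)
```

The tangent plane is $\mathbf{n} \cdot \mathbf{R} = \mathbf{n} \cdot \mathbf{R}_0$, that is, $2x + 8y - 6z = 24$ up to rounding, which reduces to the equation $x + 4y - 3z = 12$ found in the text.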
Introduction of the tangent plane leads to a simple interpretation of the differential (Chap. 3, Sec. 3). If the equation of a surface is given in the form $z = f(x,y)$, then
$$f(x,y) - z = 0$$
and hence (10-1) holds with $u(x,y,z) = f(x,y) - z$. By (10-5) a normal is
$$\mathbf{n} = \mathbf{i}\frac{\partial f}{\partial x} + \mathbf{j}\frac{\partial f}{\partial y} - \mathbf{k}, \tag{10-6}$$
so that the tangent plane has the equation¹
$$(x - x_0)\frac{\partial f}{\partial x} + (y - y_0)\frac{\partial f}{\partial y} = z - z_0, \tag{10-7}$$
where $\partial f/\partial x$ and $\partial f/\partial y$ are evaluated at $(x_0,y_0)$. If we set $x - x_0 = \Delta x$, $y - y_0 = \Delta y$, and $z - z_0 = \Delta z$ in (10-7) (Fig. 22), there results
$$\frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y = \Delta z.$$
Fig. 22
The left-hand side is simply the differential $df$, and hence the differential for the surface $z = f(x,y)$ equals the increment for the tangent plane. The definition of differentiability given in Chap. 3, Sec. 3, now has a simple intuitive meaning; namely, $f(x,y)$ is differentiable if, and only if, the surface $z = f(x,y)$ is well approximated by its tangent plane.
¹ The values $x$, $y$, $z$ in (10-7) refer to the tangent plane and must not be confused with the values $x$, $y$, $z$ on the surface $z = f(x,y)$.
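The relation between the differential and the tangent plane can be illustrated with numbers. In the following sketch the surface $z = x^2 y$, the base point and the increments are hypothetical choices made only for illustration; the partial derivatives are entered by hand.

```python
def f(x, y):
    return x * x * y


def tangent_plane_increment(fx, fy, dx, dy):
    """Increment of z on the tangent plane: (df/dx) dx + (df/dy) dy."""
    return fx * dx + fy * dy


x0, y0 = 1.0, 2.0
fx = 2 * x0 * y0      # df/dx = 2xy evaluated at (x0, y0)
fy = x0 * x0          # df/dy = x^2 evaluated at (x0, y0)
dx, dy = 0.01, -0.02

dz_plane = tangent_plane_increment(fx, fy, dx, dy)
dz_surface = f(x0 + dx, y0 + dy) - f(x0, y0)
print(dz_plane, dz_surface, abs(dz_plane - dz_surface))
```

The two increments differ only by a quantity of second order in the step, which is the sense in which the surface is well approximated by its tangent plane.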
PROBLEMS
1. By use of (10-5) find a vector normal to the plane $ax + by + cz + d = 0$. Compare Sec. 9, Example 1.
2. At the point $(2,1,3)$ on the surface $xyz = x^2 + 2$ find (a) a normal vector, (b) an equation for the tangent plane, (c) an equation for the normal line.
3. Show that the surfaces $xyz = 1$ and $x^2 + y^2 - 2z^2 = 0$ intersect at right angles at the point $(1,1,1)$; that is, the tangent planes are perpendicular.
4. The two surfaces $x^2 + y^2 + z^2 = 6$ and $2x^2 + 3y^2 + z^2 = 9$ intersect at $(1,1,2)$. Find the angle between the tangent planes at this point.
5. In Prob. 4 find a vector tangent to the curve in which the surfaces intersect. Hint: The required vector is perpendicular to both normals.
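11. Frenet's Formulas. It was shown in Sec. 7 that the vector $\mathbf{R}'(t)$ is tangent to the space curve $\mathbf{R} = \mathbf{R}(t)$ and has length $|\mathbf{R}'| = ds/dt$, where $s$ is the arc along the curve. If the parameter itself is equal to the arc, so that $t = s$ and
$$\mathbf{R} = \mathbf{R}(s), \tag{11-1}$$
then $ds/dt = 1$. In this case the vector
$$\mathbf{T} = \frac{d\mathbf{R}}{ds} \tag{11-2}$$
is a tangent vector of unit length.
From $\mathbf{T} \cdot \mathbf{T} = 1$ we deduce that $d\mathbf{T}/ds$ is perpendicular to $\mathbf{T}$ (Sec. 7, Example 2). Hence we may write
$$\frac{d\mathbf{T}}{ds} = \kappa\mathbf{N}, \qquad \kappa > 0, \tag{11-3}$$
where $\mathbf{N}$ is a unit vector perpendicular to $\mathbf{T}$ and where $\kappa$ is a scalar multiplier. The vector $\mathbf{N}$ defined by (11-3) is called the principal normal, and the scalar $\kappa$ is called the curvature. The plane of $\mathbf{T}$ and $\mathbf{N}$ is termed the osculating plane. We define $\kappa = 0$ for a straight line.
If we introduce a third unit vector $\mathbf{B}$ defined by $\mathbf{B} = \mathbf{T} \times \mathbf{N}$, then the system $\mathbf{T}$, $\mathbf{N}$, $\mathbf{B}$ forms a right-handed set of orthogonal unit vectors, analogous to the vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ introduced previously. By Fig. 23,
$$\mathbf{N} \times \mathbf{B} = \mathbf{T}, \qquad \mathbf{B} \times \mathbf{T} = \mathbf{N}, \qquad \mathbf{T} \times \mathbf{N} = \mathbf{B}. \tag{11-4}$$
The vector $\mathbf{B}$ is called the binormal; the figure formed by $\mathbf{T}$, $\mathbf{N}$, $\mathbf{B}$ is sometimes referred to as the trihedral associated with the curve.
Differentiating the relation $\mathbf{B} = \mathbf{T} \times \mathbf{N}$ and using (11-3) give
$$\mathbf{B}' = \mathbf{T} \times \mathbf{N}' + \mathbf{T}' \times \mathbf{N} = \mathbf{T} \times \mathbf{N}' + (\kappa\mathbf{N}) \times \mathbf{N} = \mathbf{T} \times \mathbf{N}',$$
and hence $\mathbf{B}'$ is perpendicular to $\mathbf{T}$. It is also perpendicular to $\mathbf{B}$, since $\mathbf{B} \cdot \mathbf{B} = 1$, and therefore $\mathbf{B}'$ is parallel to $\mathbf{N}$:
$$\frac{d\mathbf{B}}{ds} = \tau\mathbf{N}. \tag{11-5}$$
The scalar multiple $\tau$ in (11-5) is called the torsion; it measures the rate at which the curve twists out of its osculating plane. We define $\tau = 0$ for a straight line.
To evaluate $d\mathbf{N}/ds$, recall that $\mathbf{N} = \mathbf{B} \times \mathbf{T}$. Hence
$$\mathbf{N}' = \mathbf{B} \times \mathbf{T}' + \mathbf{B}' \times \mathbf{T} = \kappa\mathbf{B} \times \mathbf{N} + \tau\mathbf{N} \times \mathbf{T} \tag{11-6}$$
by (11-3) and (11-5). When we use (11-4), Eq. (11-6) reduces to
$$\frac{d\mathbf{N}}{ds} = -\kappa\mathbf{T} - \tau\mathbf{B}. \tag{11-7}$$
Equations (11-3), (11-5), and (11-7) are known as the Frenet-Serret formulas; they are of fundamental importance in the theory of space curves.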
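By equating the lengths of the two vectors in (11-3) and recalling $|\mathbf{N}| = 1$ we obtain
$$\kappa = |\mathbf{T}'| = |\mathbf{R}''|, \qquad {}' = \frac{d}{ds}. \tag{11-8}$$
To get a similar formula for $\tau$ we differentiate (11-3), obtaining
$$\mathbf{T}'' = \kappa\mathbf{N}' + \kappa'\mathbf{N} = \kappa(-\kappa\mathbf{T} - \tau\mathbf{B}) + \kappa'\mathbf{N} \tag{11-9}$$
by (11-7). Hence, by (11-3), (11-9), and (11-4),
$$\mathbf{T}' \times \mathbf{T}'' = \kappa\mathbf{N} \times (-\kappa^2\mathbf{T} - \kappa\tau\mathbf{B} + \kappa'\mathbf{N}) = \kappa^3\mathbf{B} - \kappa^2\tau\mathbf{T},$$
since $\mathbf{N} \times \mathbf{N} = \mathbf{0}$. Taking the dot product with $\mathbf{T}$ yields
$$\mathbf{T} \cdot \mathbf{T}' \times \mathbf{T}'' = -\kappa^2\tau. \tag{11-10}$$
If we solve (11-10) for $\tau$, express $\kappa^2$ in terms of $\mathbf{R}$ by (11-8), and express $\mathbf{T}$ in terms of $\mathbf{R}$ by (11-2), there results
$$\tau = -\frac{\mathbf{R}' \cdot \mathbf{R}'' \times \mathbf{R}'''}{\mathbf{R}'' \cdot \mathbf{R}''}, \tag{11-11}$$
which is the desired formula. When $\mathbf{R} = \mathbf{i}x(s) + \mathbf{j}y(s) + \mathbf{k}z(s)$, Eqs. (11-8) and (11-11) give, respectively,
$$\kappa^2 = (x'')^2 + (y'')^2 + (z'')^2, \qquad \tau = -\frac{1}{\kappa^2}\begin{vmatrix} x' & y' & z' \\ x'' & y'' & z'' \\ x''' & y''' & z''' \end{vmatrix}. \tag{11-12}$$
It can be shown that $\kappa(s)$ and $\tau(s)$ determine the curve completely, apart from its position in space.¹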
Since a smooth curve can always be expressed in terms of its arc as parameter, the foregoing theory suffers no loss of generality by assuming $t = s$. In many physical problems, however, it is more fruitful to take the time $t$ as parameter, and this possibility is now to be examined.
Let $\mathbf{R} = \mathbf{R}(t)$ give the position of a moving particle at time $t$, so that the velocity is $\mathbf{V} = \mathbf{R}'(t)$. With $v = ds/dt$ we have²
$$\mathbf{V} = \frac{d\mathbf{R}}{dt} = \frac{d\mathbf{R}}{ds}\frac{ds}{dt} = \mathbf{T}v, \tag{11-13}$$
upon using (11-2). Since (11-3) gives
$$\frac{d\mathbf{T}}{dt} = \frac{d\mathbf{T}}{ds}\frac{ds}{dt} = \kappa\mathbf{N}v,$$
we get
$$\frac{d\mathbf{V}}{dt} = \mathbf{T}\frac{dv}{dt} + v\frac{d\mathbf{T}}{dt} = \mathbf{T}\frac{dv}{dt} + \kappa v^2\mathbf{N} \tag{11-14}$$
upon differentiating (11-13). Hence the acceleration vector $\mathbf{A} = d\mathbf{V}/dt$ lies in the osculating plane, its tangential component has magnitude equal to the linear acceleration $dv/dt$, and its normal component has magnitude $\kappa v^2$. This is a far-reaching generalization of the familiar results
$$A_{\text{tangential}} = 0, \qquad A_{\text{normal}} = \frac{v^2}{r}$$
for uniform motion in a circle of radius $r$.
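The decomposition of the acceleration into tangential and normal parts can be checked on the circular case just mentioned. The sketch below is plain Python; the function name and the sample radius, angular rate and instant are illustrative assumptions. The velocity and acceleration of the circular motion are entered in closed form.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def acceleration_components(velocity, acceleration):
    """Split A into its component along T = V / v and the remaining normal part."""
    speed = math.sqrt(dot(velocity, velocity))
    tangent = tuple(c / speed for c in velocity)
    a_tangential = dot(acceleration, tangent)
    remainder = tuple(a - a_tangential * t for a, t in zip(acceleration, tangent))
    a_normal = math.sqrt(dot(remainder, remainder))
    return a_tangential, a_normal


# Uniform motion R = (r cos wt, r sin wt, 0) with hypothetical r, w, t.
r, omega, t = 2.0, 3.0, 0.7
V = (-r * omega * math.sin(omega * t), r * omega * math.cos(omega * t), 0.0)
A = (-r * omega ** 2 * math.cos(omega * t), -r * omega ** 2 * math.sin(omega * t), 0.0)

a_t, a_n = acceleration_components(V, A)
speed = r * omega
print(a_t, a_n, speed ** 2 / r)
```

The tangential component vanishes up to rounding and the normal component agrees with $v^2/r$, so that the curvature of the circle is $1/r$.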
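Taking the cross product of (11-13) and (11-14) with $\mathbf{V}$ replaced by $\mathbf{R}'$ we obtain
$$\mathbf{R}' \times \mathbf{R}'' = \kappa v^3\,\mathbf{T} \times \mathbf{N} = \kappa v^3\mathbf{B}.$$
Hence, the direction of the binormal is given by $\mathbf{R}' \times \mathbf{R}''$ even when the parameter is $t$ rather than $s$. Since $\mathbf{B}$ is a unit vector, we have
$$\mathbf{B} = \frac{\mathbf{R}' \times \mathbf{R}''}{|\mathbf{R}' \times \mathbf{R}''|}, \qquad {}' = \frac{d}{dt}, \tag{11-15}$$
and similarly, the unit vector $\mathbf{T}$ is obtained from
$$\mathbf{T} = \frac{\mathbf{R}'}{|\mathbf{R}'|}, \qquad {}' = \frac{d}{dt}. \tag{11-16}$$
Knowing $\mathbf{B}$ and $\mathbf{T}$ we find $\mathbf{N}$ from
$$\mathbf{N} = \mathbf{B} \times \mathbf{T}. \tag{11-17}$$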
¹ See, for example, L. P. Eisenhart, “An Introduction to Differential Geometry,” sec. 6, pp. 25-27, Princeton University Press, Princeton, N.J., 1940.
² In agreement with the results of Sec. 7, Eq. (11-13) expresses the fact that the velocity is tangent to the orbit and has magnitude equal to the speed.
These formulas enable us to compute the trihedral when the curve is given with an arbitrary parameter $t$ provided $ds/dt > 0$.
Example 1. Find the equation of the osculating plane at $t = 1$ for the curve $\mathbf{R} = t\mathbf{i} + 2t^2\mathbf{j} + t^3\mathbf{k}$.
Differentiation gives
$$\mathbf{R}'(1) = \mathbf{i} + 4\mathbf{j} + 3\mathbf{k}, \qquad \mathbf{R}''(1) = 4\mathbf{j} + 6\mathbf{k}.$$
Hence by (11-15) the binormal $\mathbf{B}$ is parallel to
$$(\mathbf{i} + 4\mathbf{j} + 3\mathbf{k}) \times (4\mathbf{j} + 6\mathbf{k}) = 12\mathbf{i} - 6\mathbf{j} + 4\mathbf{k}.$$
The osculating plane is normal to $\mathbf{B}$ and contains the point
$$\mathbf{R}(1) = \mathbf{i} + 2\mathbf{j} + \mathbf{k}.$$
Hence its equation is $6x - 3y + 2z = 2$, as the reader can verify.
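Example 1 lends itself to a direct check, and the same data give the whole trihedral by the formulas for $\mathbf{B}$, $\mathbf{T}$ and $\mathbf{N}$ with an arbitrary parameter. The following is a minimal sketch in plain Python; the helper names are illustrative, and the derivatives of the curve at $t = 1$ are entered by hand.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    length = math.sqrt(dot(a, a))
    return tuple(c / length for c in a)


# R = t i + 2t^2 j + t^3 k evaluated at t = 1.
point = (1, 2, 1)        # R(1)
r_prime = (1, 4, 3)      # R'(1)
r_second = (0, 4, 6)     # R''(1)

binormal_direction = cross(r_prime, r_second)          # (12, -6, 4)
plane_constant = dot(binormal_direction, point)        # 12x - 6y + 4z = 4

B = unit(binormal_direction)   # unit binormal
T = unit(r_prime)              # unit tangent
N = cross(B, T)                # principal normal
print(binormal_direction, plane_constant)
print(T, N, B)
```

Dividing $12x - 6y + 4z = 4$ by 2 gives the equation $6x - 3y + 2z = 2$ of the text. The vector `N` comes out as a unit vector automatically, because `B` and `T` are orthogonal unit vectors.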
Example 2. A curve is a plane curve if, and only if, the torsion is zero.
If the curve is a plane curve, not a straight line, then the osculating plane is well defined and is the plane of the curve. Hence $\mathbf{B}$ is constant, and the torsion vanishes by (11-5). Suppose, conversely, that the torsion is zero. Then $\mathbf{B}$ is constant by (11-5), and therefore, using (11-2),
$$\frac{d}{ds}(\mathbf{B} \cdot \mathbf{R}) = \mathbf{B} \cdot \frac{d\mathbf{R}}{ds} = \mathbf{B} \cdot \mathbf{T} = 0.$$
This gives $\mathbf{B} \cdot \mathbf{R} = \text{const}$, which is the equation of a plane.
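Example 3. Consider the circular helix (Fig. 24) with equation
$$\mathbf{R} = \mathbf{i}a\cos\theta + \mathbf{j}a\sin\theta + \mathbf{k}p\theta, \qquad a, p \text{ positive const.} \tag{11-18}$$
Here, the parametric equations are
$$x = a\cos\theta, \qquad y = a\sin\theta, \qquad z = p\theta.$$
By (11-8) $\kappa = |\mathbf{R}''|$, where primes denote differentiation with respect to the arc parameter $s$. From (11-18)
$$d\mathbf{R} = -\mathbf{i}a\sin\theta\,d\theta + \mathbf{j}a\cos\theta\,d\theta + \mathbf{k}p\,d\theta,$$
so that
$$ds^2 = d\mathbf{R} \cdot d\mathbf{R} = (a^2\sin^2\theta + a^2\cos^2\theta + p^2)(d\theta)^2 = (a^2 + p^2)\,d\theta^2$$
and therefore $d\theta/ds = 1/\sqrt{a^2 + p^2} = h$, say. It follows that
$$\frac{d\mathbf{R}}{ds} = \frac{d\mathbf{R}}{d\theta}\frac{d\theta}{ds} = (-\mathbf{i}a\sin\theta + \mathbf{j}a\cos\theta + \mathbf{k}p)h,$$
$$\frac{d^2\mathbf{R}}{ds^2} = \frac{d}{d\theta}\left(\frac{d\mathbf{R}}{ds}\right)\frac{d\theta}{ds} = (-\mathbf{i}a\cos\theta - \mathbf{j}a\sin\theta)h^2,$$
$$\frac{d^3\mathbf{R}}{ds^3} = \frac{d}{d\theta}\left(\frac{d^2\mathbf{R}}{ds^2}\right)\frac{d\theta}{ds} = (\mathbf{i}a\sin\theta - \mathbf{j}a\cos\theta)h^3.$$
Fig. 24
On making use of formula (11-8) we find
$$\kappa^2 = \mathbf{R}'' \cdot \mathbf{R}'' = (a^2\sin^2\theta + a^2\cos^2\theta)h^4 = a^2h^4,$$
so that
$$\kappa = \frac{a}{a^2 + p^2}.$$
According to (11-12) the torsion is
$$\tau = -\frac{h^6}{\kappa^2}\begin{vmatrix} -a\sin\theta & a\cos\theta & p \\ -a\cos\theta & -a\sin\theta & 0 \\ a\sin\theta & -a\cos\theta & 0 \end{vmatrix} = -\frac{p}{a^2 + p^2}.$$
If $p = 0$, we get a circle of radius $a$ by inspection of (11-18). In this case $\tau = 0$ because the curve is a plane curve and $\kappa = 1/a$ because the radius is always equal to the constant $a$. The behavior as $p \to \infty$ may be discussed similarly.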
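The helix results can be confirmed numerically from the arc-length derivatives obtained in Example 3. The sketch below is plain Python; the function names and the sample values of $a$, $p$ and $\theta$ are assumptions made for illustration. The torsion is computed with the sign convention of (11-11), which follows from $d\mathbf{B}/ds = \tau\mathbf{N}$; many later texts define the torsion with the opposite sign.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def helix_derivatives(a, p, theta):
    """First three derivatives of the helix with respect to arc length s."""
    h = 1.0 / math.sqrt(a * a + p * p)      # h = d(theta)/ds
    s, c = math.sin(theta), math.cos(theta)
    r1 = (-a * s * h, a * c * h, p * h)
    r2 = (-a * c * h ** 2, -a * s * h ** 2, 0.0)
    r3 = (a * s * h ** 3, -a * c * h ** 3, 0.0)
    return r1, r2, r3


def curvature_and_torsion(r1, r2, r3):
    """kappa = |R''| and tau = -(R' . R'' x R''') / (R'' . R'')."""
    kappa_squared = dot(r2, r2)
    triple_product = dot(r1, cross(r2, r3))
    return math.sqrt(kappa_squared), -triple_product / kappa_squared


a, p = 3.0, 4.0
kappa, tau = curvature_and_torsion(*helix_derivatives(a, p, theta=0.9))
print(kappa, a / (a * a + p * p))
print(tau, -p / (a * a + p * p))
```

With $a = 3$ and $p = 4$ the values are $\kappa = 3/25$ and $\tau = -4/25$, independent of $\theta$, as the closed formulas require.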
PROBLEMS
1. Given the curve $\mathbf{R}(t) = \mathbf{i}(t^2 - 1) + 2t\mathbf{j} + (t^2 + 1)\mathbf{k}$. (a) Find a unit tangent at $t = -1$. (b) Find the equation of the normal plane at this point. (c) Find the length of the curve from $t = 0$ to $t = 1$.
2. (a) If $\mathbf{R}(t)$ in Prob. 1 represents an orbit, find the velocity and acceleration at time $t$. (b) By use of (a) and Eq. (11-13), find the speed $v$ at time $t$. (c) By use of (a), (b), and (11-14) find the curvature $\kappa$ and the principal normal $\mathbf{N}$ at time $t$.
3. If the components of $\mathbf{R}(t)$ are second-degree polynomials in $t$, then $\mathbf{R} = \mathbf{R}(t)$ is a plane curve. (a) Prove this by use of (11-12) and Example 2. (b) Find the equation of the plane.
4. Show that (a) the tangents to a helix make a fixed angle with the axis of the helix, (b) the principal normal is perpendicular to the axis of the helix.
5. (a) Given a particle moving according to the law $\mathbf{R}(t) = \mathbf{i}t + \mathbf{j}t^2$, find a unit tangent and a unit normal to the orbit at $t = 1$. (b) Find the cartesian components of $\mathbf{V}$ and $\mathbf{A}$ at $t = 1$. (c) By use of the dot product and (a), find the tangential components $V_t$ and $A_t$ of $\mathbf{V}$ and $\mathbf{A}$ at $t = 1$. (d) Find $ds/dt$ as $|\mathbf{R}'(t)|$, and from this find $d^2s/dt^2$. Compare (c).
6. In Prob. 5, (a) show that $V_n$, the normal component of $\mathbf{V}$, is zero, and find that of $\mathbf{A}$ at $t = 1$ by use of the dot product and Prob. 5a. (b) By (a) and $A_n = \kappa|\mathbf{V}|^2$ find the curvature of the orbit at $t = 1$. (c) Show that the cartesian equation of the orbit is $y = x^2$, and compute the curvature by $\kappa = y''/(1 + y'^2)^{3/2}$. Compare (b). (d) Explain how to find $A_n$ in terms of $A_x$, $A_y$, and $A_t$, and use this to check some of your work.
LINEAR VECTOR SPACES AND MATRICES
12. Spaces of Higher Dimensions. There is nothing mysterious about
the idea of spaces whose dimensionality is greater than three. In locating
objects in the familiar three-dimensional space of our physical intuition,
we have found it convenient to introduce a coordinate system and to
specify the location of any point in the object by means of three numbers
termed the coordinates of the point. Thus, if a cartesian system of axes
is introduced, we can associate with each point P an ordered triple of
labels ( x,y,z ).
In dealing with the state of gas determined by the pressure p, volume
v, and temperature T, it is often useful to visualize the triples of values
(p,v,T) as coordinates of points in three-dimensional space, but such a
visualization fails when the number of variables characterizing the gas-
state exceeds three. Thus, the state of gas may (and generally does)
depend not only on the pressure, volume, and temperature, but also on
the time t. Although a quadruple of values (p,v,T,t) cannot be represented
as a point in a fixed coordinate system in the three-dimensional space, the
geometric visualization is of much lesser importance than the analytic
apparatus developed for coping with the geometric problems. This ap-
paratus (analytic geometry and vector analysis) makes use of the tools
of algebra and analysis which involve operations on ordered sets of quantities such as $(p,v,T,t)$ or $(x_1, x_2, \ldots, x_n)$, which are valid regardless of the number of variables appearing in the set.
The habits of using the language associated with geometric thinking
are so strong, however, that it is natural to continue speaking figuratively
of a quadruple of numbers $(p,v,T,t)$ as representing a point in four-dimensional space and more generally refer to an ordered set of $n$ values $(x_1, x_2, \ldots, x_n)$ as a point in n-dimensional space. The values $x_1, x_2, \ldots, x_n$ may be of quite diverse sorts; the first three, for example, may be associated with cartesian coordinates of some point $M$ in three-dimensional physical space, $x_4$ may represent the magnitude of electric charge located at $M$, $x_5$ may stand for the time of observation, and so on. But whatever meaning we choose to attach to the individual values $x_i$, we can speak of the n-tuple $(x_1, x_2, \ldots, x_n)$ as representing a point $P$ in n-dimensional space.
In three-dimensional space we found it useful to associate with every pair of points $P_1$ and $P_2$ an entity $\overrightarrow{P_1P_2}$ which we called a vector $\mathbf{a}$, and we have developed a set of rules for operations with vectors which form the basis for the algebra and calculus of vectors.
Although in the initial formulation of these rules we have been guided by geometric considerations, we have distilled out geometry by giving a set of algebraic laws (2-1), (2-3), (2-4), (4-3), (4-4), and (4-5) which govern operations with vectors.
We can continue using the suggestive language of three-dimensional vector analysis and say that every pair of points $P_1$, $P_2$ in n-dimensional space determines a vector $\mathbf{a}$. We further stipulate that in devising the rules for operating on such vectors we adopt the set of algebraic laws (2-1), (2-3), (2-4), (4-3), (4-4), (4-5), which contain no reference to the dimensionality of space, and we define the vector $\mathbf{0}$ by the relation $\mathbf{a} + \mathbf{0} = \mathbf{0} + \mathbf{a} = \mathbf{a}$ for every vector $\mathbf{a}$.
The dimensionality of space, we recall, entered only when we made use of these laws in those calculations which involved the representations of vectors by components in special coordinate systems. Thus in Sec. 3 we considered a vector in the plane determined by a pair of noncollinear vectors and introduced the notion of base vectors and the so-called components of the vector along the base vectors. We also saw that a vector in three-dimensional space can be represented uniquely in terms of its components in the directions of three noncoplanar base vectors. These
remarks suggest that the dimensionality of space is in some way connected
with the number of base vectors needed to represent a given vector by
components. In providing a generalization of the representation of vectors
by components in spaces of higher dimensions, we need the notion of
linear dependence of a set of vectors which we develop next.
13. The Dimensionality of Space. Linear Vector Spaces. The concept of linear dependence of a set of vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is intimately connected with the idea of dimensionality of space.
Definition. A set of $n$ vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is linearly dependent if there exists a set of numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$, not all of which are zero, such that
$$\alpha_1\mathbf{a}_1 + \alpha_2\mathbf{a}_2 + \cdots + \alpha_n\mathbf{a}_n = \mathbf{0}. \tag{13-1}$$
If no such numbers exist, the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$ are said to be linearly independent.¹
To get at the geometric meaning of this definition consider two vectors $\mathbf{a}$ and $\mathbf{b}$ which are like or oppositely directed (Fig. 25). Then we can find a number $k \neq 0$ such that
$$\mathbf{b} = k\mathbf{a}. \tag{13-2}$$
Fig. 25
We can write this equation in symmetric form by setting $k = -\alpha/\beta$, so that (13-2) reads
$$\alpha\mathbf{a} + \beta\mathbf{b} = \mathbf{0}. \tag{13-3}$$
Since neither $\alpha$ nor $\beta$ is zero, it follows from our definition of linear dependence that two collinear vectors are always linearly dependent. Inasmuch as every vector $\mathbf{b}$ directed along $\mathbf{a}$ can be represented in the form (13-2), formula (13-2) serves to define a one-dimensional linear vector space. We observe that every two vectors in such a space are linearly dependent.
If we consider two noncollinear vectors $\mathbf{a}$ and $\mathbf{b}$ (Fig. 26), then every vector $\mathbf{c}$ in their plane can be represented in the form
$$\mathbf{c} = k_1\mathbf{a} + k_2\mathbf{b} \tag{13-4}$$
by a suitable choice of the constants $k_1$ and $k_2$. Equation (13-4) can be written as
$$\alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} = \mathbf{0}, \tag{13-5}$$
in which not all constants $\alpha$, $\beta$, $\gamma$ are zero. Formula (13-4) determines every vector $\mathbf{c}$ in the plane of $\mathbf{a}$ and $\mathbf{b}$, and it thus defines a two-dimensional linear vector space, while formula (13-5) ensures that every three vectors in the two-dimensional space are linearly dependent.
Fig. 26    Fig. 27
If we take three noncoplanar vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ (Fig. 27), we can represent every vector $\mathbf{d}$ in the form
$$\mathbf{d} = k_1\mathbf{a} + k_2\mathbf{b} + k_3\mathbf{c}, \tag{13-6}$$
from which follows the relation
$$\alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} + \delta\mathbf{d} = \mathbf{0}, \tag{13-7}$$
in which $\alpha$, $\beta$, $\gamma$, $\delta$ are not all zero.
¹ Cf. the definition of linear dependence of a set of functions in Sec. 21, Chap. 1.
Equation (13-7) states that in a three-dimensional linear vector space defined by (13-6), four vectors are invariably linearly dependent.
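The representation of a vector in terms of three noncoplanar vectors, and the resulting linear dependence of the four vectors, can be made concrete. The sketch below is plain Python; it finds the constants $k_1$, $k_2$, $k_3$ by Cramer's rule expressed through scalar triple products, a technique chosen here for illustration and not used in the text. The function name and the sample vectors are likewise assumptions.

```python
from fractions import Fraction


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def measure_numbers(a, b, c, d):
    """Solve d = k1 a + k2 b + k3 c for noncoplanar integer vectors a, b, c."""
    volume = dot(a, cross(b, c))        # zero exactly when a, b, c are coplanar
    if volume == 0:
        raise ValueError("a, b, c are coplanar and cannot serve as a basis")
    return (Fraction(dot(d, cross(b, c)), volume),
            Fraction(dot(a, cross(d, c)), volume),
            Fraction(dot(a, cross(b, d)), volume))


a, b, c = (1, 0, 0), (1, 1, 0), (1, 1, 1)
d = (2, 3, 4)
k1, k2, k3 = measure_numbers(a, b, c, d)

# The dependence relation k1 a + k2 b + k3 c - d = 0, component by component.
relation = tuple(k1 * ai + k2 * bi + k3 * ci - di
                 for ai, bi, ci, di in zip(a, b, c, d))
print(k1, k2, k3, relation)
```

Here the relation among the four vectors holds with the coefficient of $\mathbf{d}$ equal to $-1$, so not all four coefficients are zero. Exact rational arithmetic is used so that the zero vector is obtained without rounding error.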
The foregoing discussion indicates a relationship between the dimensionality of a vector space and the number of linearly independent vectors required to represent any vector in one-, two-, or three-dimensional vector space.
We generalize this relationship by saying that in an n-dimensional linear vector space every vector $\mathbf{x}$ can be represented in the form
$$\mathbf{x} = k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \cdots + k_n\mathbf{a}_n, \tag{13-8}$$
where $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is any set of $n$ linearly independent vectors. It follows from (13-8) that in such a space every set of more than $n$ vectors is linearly dependent.
We shall call a given set of $n$ linearly independent vectors the base vectors (or the basis) of the n-dimensional linear vector space, and we shall term the numbers $(k_1, k_2, \ldots, k_n)$ the measure numbers associated with the basis $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$.
In Sec. 3 we noted that every vector $\mathbf{V}$ in three-dimensional vector space can be represented uniquely by taking as a basis any set of three linearly independent vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$. But we saw that a special set of mutually orthogonal unit vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ when used as a basis greatly simplifies the calculations. This suggests the desirability of representing a vector $\mathbf{x}$ in the n-dimensional space in the form (13-8) in which the base vectors $\mathbf{a}_i$ are the analogues of the unit vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$. The construction of an analogous set of base vectors requires the extension of the concepts of length and orthogonality to sets of vectors in n-dimensional space. In making these extensions we suppose that the scalar product $\mathbf{a} \cdot \mathbf{b}$ of $\mathbf{a}$ and $\mathbf{b}$ is a real number, and that $\mathbf{a} \cdot \mathbf{a} > 0$ unless $\mathbf{a} = \mathbf{0}$. Further the operation of scalar multiplication obeys the laws (1-3) and (1-3).
We recall that in three-dimensional space two vectors $\mathbf{a}$ and $\mathbf{b}$ are orthogonal if
$$\mathbf{a} \cdot \mathbf{b} = 0$$
and $\mathbf{a}$ is a unit vector if
$$\mathbf{a} \cdot \mathbf{a} = 1.$$
We extend these definitions to vectors in n-dimensional space and show that when any set of $n$ linearly independent vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is given, one can construct a new set of vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$, such that
$$\mathbf{e}_i \cdot \mathbf{e}_j = \begin{cases} 0, & \text{if } i \neq j, \\ 1, & \text{if } i = j. \end{cases} \tag{13-9}$$
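A set of vectors satisfying the conditions (13-9) is called an orthonormal set.
Let the set of vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ be linearly independent, so that the equation
$$\alpha_1\mathbf{a}_1 + \alpha_2\mathbf{a}_2 + \cdots + \alpha_n\mathbf{a}_n = \mathbf{0} \tag{13-10}$$
can be satisfied only by choosing $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$. It follows from (13-10) that $\mathbf{a}_1 \neq \mathbf{0}$, for if it were a zero vector, the choice
$$\alpha_1 = 1, \quad \alpha_2 = 0, \quad \ldots, \quad \alpha_n = 0$$
would satisfy (13-10) and hence the vectors $\mathbf{a}_i$ would be linearly dependent, thus contradicting our initial assumption.
We shall write
$$\mathbf{a}_i \cdot \mathbf{a}_i \equiv |\mathbf{a}_i|^2$$
and call $|\mathbf{a}_i|$ the length of $\mathbf{a}_i$. Now denote the product of $\mathbf{a}_1$ by the reciprocal of its length $|\mathbf{a}_1|$ by $\mathbf{e}_1$, so that
$$\mathbf{e}_1 = \frac{\mathbf{a}_1}{|\mathbf{a}_1|}.$$
Since $\mathbf{e}_1 \cdot \mathbf{e}_1 = 1$, $\mathbf{e}_1$ is a unit vector. The vectors
$$\mathbf{e}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$$
are obviously linearly independent. Consider next the vector
$$\mathbf{e}_2' = \mathbf{a}_2 - (\mathbf{a}_2 \cdot \mathbf{e}_1)\mathbf{e}_1.$$
The scalar product $\mathbf{e}_2' \cdot \mathbf{e}_1$ is
$$\mathbf{e}_2' \cdot \mathbf{e}_1 = \mathbf{a}_2 \cdot \mathbf{e}_1 - (\mathbf{a}_2 \cdot \mathbf{e}_1)\,\mathbf{e}_1 \cdot \mathbf{e}_1 = 0,$$
since $\mathbf{e}_1$ is a unit vector. Thus $\mathbf{e}_2'$ is orthogonal to $\mathbf{e}_1$ and the vector
$$\mathbf{e}_2 = \frac{\mathbf{e}_2'}{|\mathbf{e}_2'|}$$
is a unit vector orthogonal to $\mathbf{e}_1$.
The set of vectors
$$\mathbf{e}_1, \mathbf{e}_2, \mathbf{a}_3, \ldots, \mathbf{a}_n$$
is linearly independent, and we construct the vector
$$\mathbf{e}_3' = \mathbf{a}_3 - (\mathbf{a}_3 \cdot \mathbf{e}_1)\mathbf{e}_1 - (\mathbf{a}_3 \cdot \mathbf{e}_2)\mathbf{e}_2,$$
which is orthogonal to both $\mathbf{e}_1$ and $\mathbf{e}_2$. The vector
$$\mathbf{e}_3 = \frac{\mathbf{e}_3'}{|\mathbf{e}_3'|}$$
is a unit vector, and the set of vectors
$$\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \mathbf{a}_4, \ldots, \mathbf{a}_n$$
is a linearly independent set. We continue the process by forming
$$\mathbf{e}_4' = \mathbf{a}_4 - (\mathbf{a}_4 \cdot \mathbf{e}_1)\mathbf{e}_1 - (\mathbf{a}_4 \cdot \mathbf{e}_2)\mathbf{e}_2 - (\mathbf{a}_4 \cdot \mathbf{e}_3)\mathbf{e}_3,$$
which is orthogonal to $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$, and normalize it by dividing it by $|\mathbf{e}_4'|$. The set of vectors
$$\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \mathbf{e}_4, \mathbf{a}_5, \ldots, \mathbf{a}_n$$
is linearly independent, and a continuation of the procedure yields after $n$ steps the desired set of orthonormal vectors
$$\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n.$$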
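The construction just described can be written as a short routine. The following is a minimal sketch in plain Python for real vectors with the ordinary scalar product; the function name `orthonormalize`, the tolerance used to detect a vanishing vector, and the sample vectors are assumptions made for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(vectors, tolerance=1e-12):
    """Build e_1, ..., e_n from linearly independent a_1, ..., a_n.

    At step k the components of a_k along e_1, ..., e_{k-1} are subtracted,
    and the remainder is divided by its length.
    """
    basis = []
    for a in vectors:
        remainder = list(a)
        for e in basis:
            coefficient = dot(a, e)                      # a_k . e_j
            remainder = [r - coefficient * ej for r, ej in zip(remainder, e)]
        length = math.sqrt(dot(remainder, remainder))
        if length < tolerance:
            raise ValueError("the given vectors are linearly dependent")
        basis.append([r / length for r in remainder])
    return basis


given = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
e = orthonormalize(given)
for i in range(3):
    print([round(dot(e[i], e[j]), 12) for j in range(3)])
```

The printed table of scalar products reproduces the conditions (13-9) up to rounding. A dependent input set is reported as an error, because some remainder then has zero length and cannot be normalized.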
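14. Cartesian Reference Frames. When the base vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ of Sec. 3 are oriented along the $xyz$ axes, the coordinates of their terminal points are
$$\mathbf{i}: (1,0,0), \qquad \mathbf{j}: (0,1,0), \qquad \mathbf{k}: (0,0,1).$$
By analogy we can say that when a set of orthonormal base vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ is oriented along “a cartesian reference frame in n-dimensional Euclidean space,” the terminal points of the base vectors have the coordinates
$$\mathbf{e}_1: (1,0,0,\ldots,0), \quad \mathbf{e}_2: (0,1,0,\ldots,0), \quad \mathbf{e}_3: (0,0,1,\ldots,0), \quad \ldots, \quad \mathbf{e}_n: (0,0,0,\ldots,1).$$
In this reference frame every vector $\mathbf{x}$ has the representation
$$\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n, \tag{14-1}$$
where the $x_i$ are the components of $\mathbf{x}$.
On making use of the distributive law of scalar multiplication, we find that
$$\mathbf{x} \cdot \mathbf{x} = x_1^2 + x_2^2 + \cdots + x_n^2, \tag{14-2}$$
since
$$\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}, \tag{14-3}$$
where the symbol $\delta_{ij}$, the Kronecker delta, means
$$\delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases}$$
From (14-2) we conclude that the length $|\mathbf{x}|$ of the vector $\mathbf{x}$ is given by the formula
$$|\mathbf{x}| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$
This is the formula of Pythagoras in n-dimensional Euclidean space.
Also, if
$$\mathbf{y} = y_1\mathbf{e}_1 + y_2\mathbf{e}_2 + \cdots + y_n\mathbf{e}_n, \tag{14-4}$$
then on forming the scalar product $\mathbf{x} \cdot \mathbf{y}$ we find
$$\mathbf{x} \cdot \mathbf{y} = x_1y_1 + x_2y_2 + \cdots + x_ny_n, \tag{14-5}$$
which has the same structure as formula (1-9).
For the sum of two vectors $\mathbf{x}$, $\mathbf{y}$ with components
$$\mathbf{x}: (x_1, x_2, \ldots, x_n), \qquad \mathbf{y}: (y_1, y_2, \ldots, y_n),$$
we have the vector $\mathbf{x} + \mathbf{y}$ with components
$$\mathbf{x} + \mathbf{y}: (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n), \tag{14-6}$$
and for the product of $\mathbf{x}$ by a scalar $\alpha$,
$$\alpha\mathbf{x}: (\alpha x_1, \alpha x_2, \ldots, \alpha x_n). \tag{14-7}$$
If we have two vectors $\mathbf{x}$ and $\mathbf{y}$ in Euclidean three-dimensional space, we have a useful inequality
$$(\mathbf{x} \cdot \mathbf{y})^2 \leq (\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{y}), \tag{14-8}$$
which follows directly from the fact that
$$\cos^2\theta = \frac{(\mathbf{x} \cdot \mathbf{y})^2}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{y})} \leq 1.$$
We show next that the formula (14-8), known as the Cauchy-Schwarz inequality, is valid in an n-dimensional Euclidean space.
Indeed,
$$(\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{y}) - (\mathbf{x} \cdot \mathbf{y})^2 = \mathbf{y} \cdot \mathbf{y}\left[\mathbf{x} \cdot \mathbf{x} - 2\frac{(\mathbf{x} \cdot \mathbf{y})^2}{\mathbf{y} \cdot \mathbf{y}} + \frac{(\mathbf{x} \cdot \mathbf{y})^2}{\mathbf{y} \cdot \mathbf{y}}\right] = |\mathbf{y}|^2\left|\mathbf{x} - \frac{\mathbf{x} \cdot \mathbf{y}}{\mathbf{y} \cdot \mathbf{y}}\,\mathbf{y}\right|^2 \geq 0,$$
which proves the inequality (14-8). We note that the equality sign in (14-8) holds if, and only if, $\mathbf{y} = \mathbf{0}$ or $\mathbf{x} = \alpha\mathbf{y}$ for some scalar $\alpha$.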
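The formula (14-8) enables us to establish the result
$$|\mathbf{x} + \mathbf{y}| \le |\mathbf{x}| + |\mathbf{y}|, \tag{14-9}$$
analogous to the "triangle inequality" of Prob. 3 in Sec. 3. We compute
$$|\mathbf{x} + \mathbf{y}|^2 = (\mathbf{x} + \mathbf{y})\cdot(\mathbf{x} + \mathbf{y}) = \mathbf{x}\cdot\mathbf{x} + \mathbf{y}\cdot\mathbf{y} + 2\,\mathbf{x}\cdot\mathbf{y}
\le |\mathbf{x}|^2 + |\mathbf{y}|^2 + 2\,|\mathbf{x}\cdot\mathbf{y}|. \tag{14-10}$$
But from (14-8)
$$(\mathbf{x}\cdot\mathbf{x})(\mathbf{y}\cdot\mathbf{y}) \ge |\mathbf{x}\cdot\mathbf{y}|^2,$$
so that
$$|\mathbf{x}|\cdot|\mathbf{y}| \ge |\mathbf{x}\cdot\mathbf{y}|. \tag{14-11}$$
The substitution from (14-11) in (14-10) yields
$$|\mathbf{x} + \mathbf{y}|^2 \le |\mathbf{x}|^2 + |\mathbf{y}|^2 + 2\,|\mathbf{x}|\cdot|\mathbf{y}| = (|\mathbf{x}| + |\mathbf{y}|)^2,$$
and on extracting the square root we get the inequality (14-9).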
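The inequalities (14-8) and (14-9) can be checked numerically for any particular pair of vectors. The following is a minimal sketch in Python; the sample vectors and the helper names `dot` and `length` are illustrative assumptions, not part of the text.

```python
import math

def dot(x, y):
    """Scalar product (14-5) of two real n-component vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

def length(x):
    """Length |x| from the n-dimensional formula of Pythagoras."""
    return math.sqrt(dot(x, x))

# Hypothetical sample vectors in 4-dimensional space.
x = [1.0, -2.0, 3.0, 0.5]
y = [4.0, 0.0, -1.0, 2.0]

# Cauchy-Schwarz inequality (14-8).
cauchy_schwarz_holds = dot(x, y) ** 2 <= dot(x, x) * dot(y, y)

# Triangle inequality (14-9).
x_plus_y = [xi + yi for xi, yi in zip(x, y)]
triangle_holds = length(x_plus_y) <= length(x) + length(y)

# Equality case of (14-8): x = a*y for some scalar a.
a = 3.0
x_parallel = [a * yi for yi in y]
gap = dot(x_parallel, x_parallel) * dot(y, y) - dot(x_parallel, y) ** 2
```

For the parallel pair the difference `gap` between the two members of (14-8) vanishes, in agreement with the stated condition for equality.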
In quantum mechanics and in several other branches of physics it is necessary to
consider ordered sets of complex numbers $(x_1, x_2, \ldots, x_n)$. Such sets can be viewed as
components of a vector $\mathbf{x}$ in an n-dimensional complex vector space. For the definition
of addition of two complex vectors $\mathbf{x}$, $\mathbf{y}$ with components
$$\mathbf{x}: (x_1, x_2, \ldots, x_n), \qquad \mathbf{y}: (y_1, y_2, \ldots, y_n),$$
we can take formula (14-6) and define the multiplication by a scalar $a$ (real or complex)
by (14-7). To make the length $|\mathbf{x}|$ of the complex vector $\mathbf{x}$ real, we adopt as the
definition of scalar product of $\mathbf{x}$ and $\mathbf{y}$ the formula
$$\mathbf{x}\cdot\mathbf{y} = \bar{x}_1y_1 + \bar{x}_2y_2 + \cdots + \bar{x}_ny_n, \tag{14-12}$$
in which $\bar{x}_i$ denotes the conjugate of the complex number $x_i$. This formula specializes
to (14-5) when the components of vectors are real, since for real numbers $\bar{x}_i = x_i$. We
note from (14-12) that
$$\mathbf{y}\cdot\mathbf{x} = x_1\bar{y}_1 + x_2\bar{y}_2 + \cdots + x_n\bar{y}_n,$$
so that
$$\mathbf{x}\cdot\mathbf{y} = \overline{\mathbf{y}\cdot\mathbf{x}},$$
since the conjugate of the sum of complex numbers is equal to the sum of their
conjugates and the conjugate of the product is the product of the conjugates.
Formula (14-12) yields
$$\mathbf{x}\cdot\mathbf{x} = \bar{x}_1x_1 + \bar{x}_2x_2 + \cdots + \bar{x}_nx_n, \tag{14-13}$$
so that $|\mathbf{x}| = \sqrt{\mathbf{x}\cdot\mathbf{x}}$ is a real number.
The definition of linear independence of a set of complex vectors is that given in Sec.
13, where the constants $\alpha_i$ are now in the field of complex numbers.
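The properties of the complex scalar product (14-12) can be illustrated with Python's built-in complex numbers. This is a sketch under the convention used above, in which the components of the first factor are conjugated; the function name `cdot` and the sample vectors are assumptions made for illustration.

```python
import math

def cdot(x, y):
    """Scalar product (14-12): conjugate the components of the first factor."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

# Hypothetical vectors in a 2-dimensional complex vector space.
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

xy = cdot(x, y)
yx = cdot(y, x)

# x.x has zero imaginary part, so the length |x| is real, as in (14-13).
xx = cdot(x, x)
length_x = math.sqrt(xx.real)
```

Here `xy` equals the conjugate of `yx`, and `xx` is the real number 15, so that the length is $\sqrt{15}$.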
PROBLEMS
1. If one starts with the definition of a vector $\mathbf{x}$ as an n-tuple of n real or complex
numbers $(x_1, x_2, \ldots, x_n)$ and uses for the definition of sum and product the formulas
$$\mathbf{x} + \mathbf{y}: (x_1 + y_1, \ldots, x_n + y_n),$$
$$k\mathbf{x}: (kx_1, \ldots, kx_n),$$
$$\mathbf{x}\cdot\mathbf{y} = \sum_{i=1}^{n} \bar{x}_iy_i,$$
then
$$(\mathbf{x} + \mathbf{y})\cdot\mathbf{z} = \mathbf{x}\cdot\mathbf{z} + \mathbf{y}\cdot\mathbf{z},$$
$$\mathbf{x}\cdot(\mathbf{y} + \mathbf{z}) = \mathbf{x}\cdot\mathbf{y} + \mathbf{x}\cdot\mathbf{z},$$
$$(k\mathbf{x})\cdot\mathbf{y} = \bar{k}(\mathbf{x}\cdot\mathbf{y}),$$
$$\mathbf{x}\cdot(k\mathbf{y}) = k(\mathbf{x}\cdot\mathbf{y}).$$
2. Prove that if $\mathbf{a}^{(1)}, \mathbf{a}^{(2)}, \ldots, \mathbf{a}^{(n)}$ is a set of n linearly independent vectors in a
complex n-dimensional vector space, then the only vector $\mathbf{x}$ orthogonal to each of the
vectors $\mathbf{a}^{(i)}$ is the zero vector.
3. Prove that a set of mutually orthogonal vectors is always linearly independent.
4. Modify the proof of orthogonalization in Sec. 13 so that it applies to a set of linearly
independent complex vectors.
15. Summation Convention. Cramer's Rule. In dealing with expressions
involving sums of quantities it is often useful to adopt the following
summation convention: If in some expression a certain summation index
occurs twice, we omit writing the summation symbol $\Sigma$ and agree to sum the
terms in the expression for all admissible values of the index.
Thus in a linear form $\sum_{i=1}^{3} a_ix_i$ the summation index $i$ appears twice under
the summation symbol $\Sigma$, and we shall write $a_ix_i$ to mean $a_1x_1 + a_2x_2 + a_3x_3$.
The symbol $\sum_{i=1}^{3} a_{ii} = a_{11} + a_{22} + a_{33}$ will be written simply
as $a_{ii}$. Again, a double sum
$$\sum_{i=1}^{3}\sum_{j=1}^{3} a_{ij}x_ix_j = a_{11}x_1x_1 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{21}x_2x_1 + a_{22}x_2x_2
+ a_{23}x_2x_3 + a_{31}x_3x_1 + a_{32}x_3x_2 + a_{33}x_3x_3,$$
which has two repeated summation indices $i$ and $j$ under the summation
symbols, will be written as
$$a_{ij}x_ix_j.$$
The range of admissible values of the indices, of course, has to be specified.
Thus, the expression
$$a_{ij}x_j, \qquad i = 1, 2, 3, \quad j = 1, 2, 3, 4,$$
represents three linear forms
$$\begin{aligned}
&a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4,\\
&a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + a_{24}x_4,\\
&a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + a_{34}x_4,
\end{aligned}$$
corresponding to the three possible choices $i = 1, 2, 3$ of the free index $i$.
The summation index $j$ is often called the dummy index because it can be
replaced by any other letter having the same range of summation. The
dummy index is analogous to the variable of integration in a definite
integral, which can also be changed at will. Thus
$$a_{ij}x_ix_j = a_{kr}x_kx_r,$$
it being understood that the indices $i$, $j$, $k$, $r$ range over the same sets of values.
Unless a statement to the contrary is made, we shall suppose that
the indices have the range of values from 1 to n. We shall thus write
formulas (14-4) and (14-5), for example, as
$$\mathbf{y} = y_i\mathbf{e}_i, \qquad \mathbf{x}\cdot\mathbf{y} = x_iy_i,$$
and (14-13) as
$$\mathbf{x}\cdot\mathbf{x} = \bar{x}_ix_i.$$
We shall make use of this summation notation among other places in writing
formulas for the product of determinants and for the expansion of deter-
minants.
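In a program each repeated index of the summation convention becomes an explicit loop, while a free index labels the separate results. The sketch below uses hypothetical coefficients $a_{ij}$ and components $x_i$ with all indices ranging from 1 to 3 (0 to 2 in Python's numbering).

```python
# Hypothetical coefficients a_ij and components x_i, with i, j = 1, 2, 3.
a = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
x = [1, -1, 2]
n = len(x)

# a_ii: a single repeated index, summed: a_11 + a_22 + a_33.
trace = sum(a[i][i] for i in range(n))

# a_ij x_i x_j: both repeated indices i and j are summed.
quadratic_form = sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# a_ij x_j: j is summed, i is free, giving one linear form per value of i.
linear_forms = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]

# Renaming the dummy indices (i, j) -> (k, r) leaves the value unchanged.
renamed = sum(a[k][r] * x[k] * x[r] for k in range(n) for r in range(n))
```

The free index determines the shape of the result (a list of three numbers for $a_{ij}x_j$), whereas an expression with no free index, such as $a_{ij}x_ix_j$, reduces to a single number.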
We recall that a determinant
$$|a_{ij}| = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} \tag{15-1}$$
of order n represents an algebraic sum of $n!$ terms formed from the elements
$a_{ij}$ in such a way that one, and only one, element from each row $i$
and each column $j$ appears in each term.¹
The product of two determinants $|a_{ij}|$ and $|b_{ij}|$, each of the order n,
can be written as a single determinant $|c_{ij}|$ of order n in which the element
$c_{ij}$ in the ith row and jth column is²
$$c_{ij} = a_{ik}b_{jk} = a_{i1}b_{j1} + a_{i2}b_{j2} + \cdots + a_{in}b_{jn}. \tag{15-2}$$
Inasmuch as the value of the determinant $|b_{ij}|$ is unchanged when its
rows and columns are interchanged, the value of the determinant
$$|c_{ij}| = |a_{ij}|\cdot|b_{ij}|$$
with the elements (15-2) is the same as that of the determinant $|c_{ij}|$ with
the elements
$$c_{ij} = a_{ik}b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}. \tag{15-3}$$
¹ A discussion of determinants is contained in Appendix A.
² Since the number n is fixed, the term $a_{in}b_{jn}$ does not represent the sum of terms with
respect to n. Here n is not a summation index. Cf. Appendix A, Formula (1-10).
If the cofactor¹ of the element $a_{ij}$ in the determinant (15-1) is denoted
by $A_{ij}$, we can expand $|a_{ij}|$ in terms of the cofactors of elements in any
row or column of the determinant. A reference to (1-5) in Appendix A
will show that the following formulas include the Laplace developments
of (15-1):
$$a_{ij}A_{ik} = \delta_{jk}\,a, \tag{15-4}$$
$$a_{ji}A_{ki} = \delta_{jk}\,a, \tag{15-5}$$
where $\delta_{jk}$ is the Kronecker delta and $a$ stands for the value of $|a_{ij}|$; for
if in (15-4) $k \neq j$, the expression $a_{ij}A_{ik}$ represents the sum of products
of the elements in the jth column by the cofactors of the elements in the
kth column. The value of such a sum is zero, since it represents the
expansion of a determinant with two like columns. If $j = k$, the sum
$a_{ij}A_{ik}$ is the sum of products of the elements in the jth column by the
cofactors of those elements, yielding the value $a = |a_{ij}|$. Similar statements
apply to (15-5) if we replace the word "column" by "row."
Formula (15-4) enables us to give a compact derivation of Cramer's
rule for solving a system of n linear equations
$$a_{ij}x_j = b_i \tag{15-6}$$
in n unknowns $x_j$.
We multiply both members of (15-6) by the cofactors $A_{ik}$ and sum with
respect to $i$. We get
$$a_{ij}A_{ik}x_j = A_{ik}b_i.$$
But by (15-4) this is
$$\delta_{jk}\,a\,x_j = A_{ik}b_i.$$
The sum $\delta_{jk}x_j = x_k$, and we conclude that
$$x_k = \frac{A_{ik}b_i}{a} \tag{15-7}$$
whenever $a \neq 0$. The numerator in (15-7) is the determinant obtained
by replacing the elements in the kth column of $|a_{ij}|$ by the $b_i$. The reader
finding the foregoing calculations too concise will find a more expansive
discussion in Sec. 2 of Appendix A.
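Formula (15-7) translates directly into a short program. The following sketch computes cofactors by Laplace expansion and uses exact rational arithmetic; the function names and the sample system are assumptions chosen for illustration, and the approach is practical only for small n because the expansion involves $n!$ terms.

```python
from fractions import Fraction

def minor(m, i, j):
    """Matrix m with row i and column j struck out."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor(m, i, j):
    """Signed minor A_ij of the element a_ij."""
    return (-1) ** (i + j) * det(minor(m, i, j))

def cramer(a_mat, b):
    """Solve a_ij x_j = b_i by x_k = A_ik b_i / a, as in (15-7)."""
    n = len(a_mat)
    a = det(a_mat)
    if a == 0:
        raise ValueError("determinant is zero; Cramer's rule does not apply")
    return [sum(Fraction(cofactor(a_mat, i, k) * b[i], a) for i in range(n))
            for k in range(n)]

# Hypothetical system of three equations in three unknowns.
a_mat = [[2, 1, -1],
         [1, 3, 2],
         [1, 0, 1]]
b = [1, 13, 4]
x = cramer(a_mat, b)

# Check (15-4) for this matrix: a_ij A_ik = delta_jk * a.
n = len(a_mat)
laplace_ok = all(
    sum(a_mat[i][j] * cofactor(a_mat, i, k) for i in range(n))
    == (det(a_mat) if j == k else 0)
    for j in range(n) for k in range(n)
)
```

For this system the determinant is 10 and the solution is $x_1 = 1$, $x_2 = 2$, $x_3 = 3$.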
PROBLEMS
1. Write out the following expressions in full.
(a) 6 tJ a t ; (b) M a ub * (<0 (*) J dr lf (/) ^ dr/; (</) u tl ; (h)
dx x 0.i a ,
ai « Z ~ bj; ( 1 ) a tJ a t k - S/k’, 0) (A) (() a l /r/, -= b x The symbols 5,,
dXj
denote the Kronecker deltas.
¹ See Appendix A. We recall that the cofactor of $a_{ij}$ is the signed minor $M_{ij}$ of the
element $a_{ij}$, the sign being $(-1)^{i+j}$.
2 . Write out the determinants represented by the expansion u®Aa and a$ % An, where
A i, is the cofartor of the element a tJ in \a X3 1 . Also write out the determinants represented
by a&A t 2 and n^A t ^.
3. Expand the determinants:
Gil '02 0 13 Oi4
0 fl22 «23 «24
(<0 f\ A
0 0 aw U34
0 0 0 044
111 a! 0 0
(c) 1 1 x,\ ; (d) a>2 0
.r'f sj jr£ 03 b-\ r {
4. Multiply the determinant (b) in Prob. 3 by the determinants (c) and (d).
16. Matrices. In this section we introduce the concept of a matrix and
discuss some rules of operation with matrices which are of value in the
study of linear transformations.
An $m \times n$ matrix is an ordered set of $mn$ quantities $a_{ij}$ arranged in a
rectangular array of $m$ rows and $n$ columns. If $m = n$, the array is called
a square matrix of order n. The quantities $a_{ij}$ are called the elements of
the matrix. Thus, a matrix is an array
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \tag{16-1}$$
where parentheses are used to enclose the array of elements. We shall
denote matrices by capital letters, or when it is desired to exhibit a typical
element of the matrix (16-1), we shall write $(a_{ij})$.
If the order of the elements in (16-1) is changed, or if any element is
changed, a different matrix results. For example, a triple of values $(a_1, a_2, a_3)$
representing the cartesian coordinates of a point is a $1 \times 3$ matrix.
If $a_1 \neq a_2$, the matrix $(a_2, a_1, a_3)$ obviously represents a different point.
Two $m \times n$ matrices $A = (a_{ij})$ and $B = (b_{ij})$ are said to be equal if,
and only if, $a_{ij} = b_{ij}$ for each $i$ and $j$. That is, $A = B$ only when the
elements in like positions of the two arrays are equal.
We define the sum $A + B$ of two $m \times n$ matrices $A = (a_{ij})$, $B = (b_{ij})$
to be the array
$$A + B = (a_{ij} + b_{ij}), \tag{16-2}$$
and their difference $A - B$ to be the array
$$A - B = (a_{ij} - b_{ij}).$$
We shall agree to say that the product of the matrix $A = (a_{ij})$ by a constant
$k$, written $kA$, is a matrix each of whose elements is multiplied by $k$.
Thus $kA = (ka_{ij})$.
If we have an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$, we define the
product of $A$ and $B$, written $AB$, by the formula
$$AB = (a_{ij}b_{jk}), \tag{16-3}$$
where, as agreed in Sec. 15, the repeated index $j$ is summed from 1 to n.
Thus, the product $AB$ is an $m \times p$ matrix, and we can multiply two
matrices only if the number of columns in the first factor is equal to the
number of rows in the second.
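Example 1. If
$$A = \begin{pmatrix} 1 & 0 & 2\\ 2 & -1 & 3\\ 0 & 5 & 6 \end{pmatrix} \qquad\text{and}\qquad B = \begin{pmatrix} 2 & -1 & 1\\ 0 & 1 & 2\\ 1 & -2 & -1 \end{pmatrix},$$
then
$$A + B = \begin{pmatrix} 1+2 & 0-1 & 2+1\\ 2+0 & -1+1 & 3+2\\ 0+1 & 5-2 & 6-1 \end{pmatrix} = \begin{pmatrix} 3 & -1 & 3\\ 2 & 0 & 5\\ 1 & 3 & 5 \end{pmatrix}$$
and
$$AB = \begin{pmatrix}
(1)(2)+(0)(0)+(2)(1) & (1)(-1)+(0)(1)+(2)(-2) & (1)(1)+(0)(2)+(2)(-1)\\
(2)(2)+(-1)(0)+(3)(1) & (2)(-1)+(-1)(1)+(3)(-2) & (2)(1)+(-1)(2)+(3)(-1)\\
(0)(2)+(5)(0)+(6)(1) & (0)(-1)+(5)(1)+(6)(-2) & (0)(1)+(5)(2)+(6)(-1)
\end{pmatrix} = \begin{pmatrix} 4 & -5 & -1\\ 7 & -9 & -3\\ 6 & -7 & 4 \end{pmatrix}.$$
Also, if
$$C = \begin{pmatrix} 2\\ 3\\ -1 \end{pmatrix},$$
then
$$AC = \begin{pmatrix} 1 & 0 & 2\\ 2 & -1 & 3\\ 0 & 5 & 6 \end{pmatrix}\begin{pmatrix} 2\\ 3\\ -1 \end{pmatrix}
= \begin{pmatrix} (1)(2)+(0)(3)+(2)(-1)\\ (2)(2)+(-1)(3)+(3)(-1)\\ (0)(2)+(5)(3)+(6)(-1) \end{pmatrix}
= \begin{pmatrix} 0\\ -2\\ 9 \end{pmatrix}.$$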
We observe that the rule (16-2) for the addition of matrices requires
that $A + B = B + A$, but it does not follow from (16-3) that the order
of factors in the product $AB$ can be interchanged even when the matrices
are square. Indeed, for
the rule (16-3) gives
;). whii. (°
Thus, the multiplication of matrices, in general, is not commutative.
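The rule (16-3) and the failure of commutativity are easy to exhibit in code. The sketch below is an illustration only: the function name `matmul` and the two 2 × 2 matrices are hypothetical choices, not the matrices displayed in the text.

```python
def matmul(a, b):
    """Product (16-3): element (i, k) is a_ij b_jk summed over j."""
    if len(a[0]) != len(b):
        raise ValueError("columns of first factor must equal rows of second")
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

# Hypothetical square matrices of order 2.
A = [[1, 2],
     [0, 1]]
B = [[1, 0],
     [3, 1]]

AB = matmul(A, B)  # [[7, 2], [3, 1]]
BA = matmul(B, A)  # [[1, 2], [3, 7]]
commute = AB == BA
```

Since `AB` and `BA` differ, these two matrices do not commute.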
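However, if we have two square matrices of order n which have zero
elements everywhere except possibly on the main diagonal, then it follows
from (16-3) that
$$\begin{pmatrix} a_1 & 0 & \cdots & 0\\ 0 & a_2 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & a_n \end{pmatrix}
\begin{pmatrix} b_1 & 0 & \cdots & 0\\ 0 & b_2 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & b_n \end{pmatrix}
= \begin{pmatrix} a_1b_1 & 0 & \cdots & 0\\ 0 & a_2b_2 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & a_nb_n \end{pmatrix}.$$
Such matrices are called diagonal.
Thus for two diagonal matrices $A$ and $B$,
$$AB = BA.$$
A diagonal matrix in which all elements along the main diagonal are equal
is called a scalar matrix. A particular scalar matrix
$$I = \begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & 1 \end{pmatrix} \tag{16-4}$$
is called the identity (or unit) matrix.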
We note that if $I$ is the identity matrix and $A$ is any square matrix,
then¹
$$IA = AI = A. \tag{16-5}$$
By analogy with the rules of ordinary algebra, we define the zero matrix
$0$ to be the matrix such that
$$0 + A = A.$$
It follows from (16-2) that all elements of the zero matrix are zeros. We
observe that the product of two matrices may be a zero matrix even when
neither of the factors is a zero matrix. Thus, if
A «
d 1
0 0
^0 1
and
then
$$AB = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}.$$
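The remark that a product may vanish although neither factor is a zero matrix, and the commutativity of diagonal matrices, can both be checked with a few lines of Python. The matrices below are hypothetical examples chosen for this sketch and are not those of the text.

```python
def matmul(a, b):
    """Product (16-3): element (i, k) is a_ij b_jk summed over j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

# Neither factor is a zero matrix, yet the product is.
A = [[1, 1],
     [0, 0]]
B = [[1, 0],
     [-1, 0]]
AB = matmul(A, B)

# Two diagonal matrices always commute.
D1 = [[2, 0],
      [0, 3]]
D2 = [[5, 0],
      [0, 7]]
diagonal_commute = matmul(D1, D2) == matmul(D2, D1)
```

Here `AB` is the zero matrix of order 2, and the product of the diagonal matrices is the diagonal matrix with elements 10 and 21 in either order.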
¹ More generally we can show that if $AX = XA$ for every matrix $A$, then $X$ is a scalar
matrix. See Prob. 6.
If the matrix is square, it is possible to form from the elements of the
matrix a determinant whose elements have the same arrangement as those
of the matrix. This determinant is called the determinant of the matrix.
From any matrix, other matrices can be obtained by striking out a number
of rows and columns. Certain of these matrices will be square matrices,
and the determinants of these matrices are called determinants of the
matrix. For an m X n matrix, there are square matrices of orders 1, 2,
. . . , p, where p is equal to the smaller of the numbers m and n.
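Example 2. The $2 \times 3$ matrix
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23} \end{pmatrix}$$
contains the first-order square matrices $(a_{11})$, $(a_{12})$, $(a_{23})$, etc., obtained by striking out
any two columns and any one row. It also contains the second-order square matrices
$$\begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{pmatrix}, \qquad
\begin{pmatrix} a_{11} & a_{13}\\ a_{21} & a_{23} \end{pmatrix}, \qquad
\begin{pmatrix} a_{12} & a_{13}\\ a_{22} & a_{23} \end{pmatrix},$$
obtained by striking out any column of $A$.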
In many applications, it is useful to employ the notion of the rank of
a matrix A. This is defined in terms of the determinants of A. A matrix
A is said to be of rank r if there is at least one r-rowed determinant of A that
is not zero, whereas all determinants of A of order higher than r are zero or
nonexistent. 1
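Example 3. If
$$A = \begin{pmatrix} 1 & 0 & 1 & 3\\ 2 & 1 & 0 & -2\\ -1 & -1 & 1 & 5 \end{pmatrix},$$
the third-order determinants are
$$\begin{vmatrix} 1 & 0 & 1\\ 2 & 1 & 0\\ -1 & -1 & 1 \end{vmatrix} = 0, \qquad
\begin{vmatrix} 1 & 1 & 3\\ 2 & 0 & -2\\ -1 & 1 & 5 \end{vmatrix} = 0, \qquad
\begin{vmatrix} 1 & 0 & 3\\ 2 & 1 & -2\\ -1 & -1 & 5 \end{vmatrix} = 0, \qquad
\begin{vmatrix} 0 & 1 & 3\\ 1 & 0 & -2\\ -1 & 1 & 5 \end{vmatrix} = 0.$$
Since
$$\begin{vmatrix} 1 & 0\\ 2 & 1 \end{vmatrix} \neq 0,$$
there is at least one second-order determinant different from zero, whereas all third-order
determinants of $A$ are zero. Therefore, the rank of $A$ is 2.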
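The definition of rank in terms of determinants can be turned into a direct, if inefficient, search over all square submatrices. The sketch below follows that definition literally; the function names are assumptions, and for large matrices one would in practice prefer elimination to the enumeration of minors.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def rank(m):
    """Largest r for which some r-rowed determinant of m is not zero."""
    rows, cols = len(m), len(m[0])
    for r in range(min(rows, cols), 0, -1):
        for ri in combinations(range(rows), r):
            for ci in combinations(range(cols), r):
                sub = [[m[i][j] for j in ci] for i in ri]
                if det(sub) != 0:
                    return r
    return 0  # every element is zero

# A 3 x 4 matrix whose third-order determinants all vanish.
A = [[1, 0, 1, 3],
     [2, 1, 0, -2],
     [-1, -1, 1, 5]]
rank_A = rank(A)
```

For this matrix every third-order determinant is zero while the leading second-order determinant equals 1, so `rank_A` is 2.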
It should be observed that a matrix is said to have rank zero if all its
elements are zero.
¹ Cf. Appendix A, Sec. 2.
If $A = (a_{ij})$ and $B = (b_{ij})$ are two square matrices, then
$$AB = (a_{ik}b_{kj})$$
and the determinant of the matrix $AB$ is
$$|AB| = |a_{ik}b_{kj}|. \tag{16-6}$$
We note with reference to (15-3) that the elements in the ith row and jth
column of the determinant in (16-6) are precisely those that appear in
the product of two determinants $|A| = |a_{ij}|$ and $|B| = |b_{ij}|$. Thus
$$|AB| = |A|\cdot|B|, \tag{16-7}$$
or in words, the determinant $|AB|$ of the product of two matrices $A$ and $B$
is equal to the product of determinants $|A|$ and $|B|$.
It follows from (16-7) that whenever the product of two square matrices is a
zero matrix, then the determinant of at least one of the factors is zero.
A square matrix whose determinant is zero is called a singular matrix.
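Formula (16-7) can be verified for particular matrices by computing the three determinants separately. The sketch below uses sample matrices of order 3 with integer elements, so the comparison is exact; the helper names are illustrative assumptions.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def matmul(a, b):
    """Product (16-3): element (i, k) is a_ij b_jk summed over j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

# Sample square matrices of order 3.
A = [[1, 0, 2],
     [2, -1, 3],
     [0, 5, 6]]
B = [[2, -1, 1],
     [0, 1, 2],
     [1, -2, -1]]

det_A = det(A)               # -1
det_B = det(B)               # 3
det_AB = det(matmul(A, B))   # -3
```

The value of `det_AB` equals the product `det_A * det_B`, as (16-7) requires.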
PROBLEMS
1. Make use of the definitions in Sec. 16 to establish the following theorems for
matrices:
(a) $A + B = B + A$; (b) $(A + B) + C = A + (B + C)$;
(c) $(A + B)C = AC + BC$; (d) $C(A + B) = CA + CB$.
2. Verify that the matrices $A$ and $B$ in Example 1 of this section do not commute.
3 . Multiply:
(a)
1 2 3
3 1 2
1 3 2
(/>)
1 2 3
3 1
1 3 2
0
0
0
4. Show that $(AB)C = A(BC)$.
5. Determine the ranks of the matrices:
1 2 3
1 4 2
2 6 5
/l 0 1
i? - { 0 0 1
U 1 1
D —
-4
1
Is $AB = BA$? Is $AE = EA$? Are these matrices singular?
6. If $AX = XA$ for every matrix $A$, show that $X$ is a scalar matrix. Hint: Let
$X = (x_{ij})$; then since $AX = XA$, $a_{ij}x_{jk} = x_{ij}a_{jk}$ for all choices of $a_{ij}$ and $a_{jk}$. Now choose
$a_{ij} = \delta_i^{(p)}\delta_j^{(q)}$, where $\delta_i^{(p)}$ and $\delta_j^{(q)}$ are the Kronecker deltas and $p$ and $q$ have fixed but
arbitrary values ranging from 1 to n, and conclude that $x_{ij} = 0$ if $i \neq j$ and
$x_{ii} = x_{jj}$ for each $i$ and $j$.
17. Linear Transformations. The matrix notation introduced in the
preceding section enables us to study effectively properties of linear
transformations.
A set of n linear relations
$$y_i = a_{ij}x_j, \qquad i, j = 1, 2, \ldots, n, \tag{17-1}$$
where the $a_{ij}$ are constants, defines a linear transformation of the set of
n variables $x_i$ into a new set $y_i$.
We can regard the quantities $x_1, x_2, \ldots, x_n$ as components (or measure
numbers) of some vector $\mathbf{x}$ referred to a set of base vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$
in the n-dimensional vector space. The quantities $y_1, y_2, \ldots, y_n$ can be
viewed as components of another vector $\mathbf{y}$ referred to the same basis.
The relations (17-1) then represent a transformation of the vector $\mathbf{x}$ into
another vector $\mathbf{y}$. Since the lengths of $\mathbf{x}$ and $\mathbf{y}$ and their orientations
relative to the base vectors $\mathbf{a}_i$ are different in general, we can look upon
the transformation (17-1) as representing a deformation of space.
When the components of $\mathbf{x}$ and $\mathbf{y}$ are represented by the column matrices
$$X = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix}, \qquad Y = \begin{pmatrix} y_1\\ y_2\\ \vdots\\ y_n \end{pmatrix},$$
the set of relations (17-1) can be written in the form
$$Y = AX, \tag{17-2}$$
where $A = (a_{ij})$ is the matrix of the coefficients in the linear transformation
(17-1) and the product $AX$ is computed by the rule (16-3).
If $A$ is a nonsingular matrix, we can solve Eqs. (17-1) for the $x_i$ by
Cramer's rule (15-7) and obtain the inverse transformation
$$x_i = \frac{A_{ji}}{a}\,y_j, \tag{17-3}$$
where $A_{ij}$ is the cofactor of the element $a_{ij}$ in the determinant $a = |a_{ij}|$
of the matrix $A$.
The set of equations (17-3) can be written in matrix notation as
$$X = A^{-1}Y,$$
where $A^{-1}$ is
$$A^{-1} = \begin{pmatrix}
\dfrac{A_{11}}{a} & \dfrac{A_{21}}{a} & \cdots & \dfrac{A_{n1}}{a}\\
\dfrac{A_{12}}{a} & \dfrac{A_{22}}{a} & \cdots & \dfrac{A_{n2}}{a}\\
\vdots & \vdots & & \vdots\\
\dfrac{A_{1n}}{a} & \dfrac{A_{2n}}{a} & \cdots & \dfrac{A_{nn}}{a}
\end{pmatrix}. \tag{17-4}$$
It is natural to call $A^{-1}$ the inverse matrix of $A$. We note that the
inverse matrix can be constructed whenever $A$ is nonsingular, that is,
whenever the determinant $|A| = a$ does not vanish.
If we form the product of $A$ and $A^{-1}$,
$$AA^{-1} = \left(a_{ik}\,\frac{A_{jk}}{a}\right), \tag{17-5}$$
and recall¹ that
$$a_{ik}A_{jk} = \delta_{ij}\,a,$$
we can write (17-5) as
$$AA^{-1} = (\delta_{ij}) = I, \tag{17-6}$$
where $I$ is the identity matrix.
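The construction (17-4) of the inverse from cofactors can be sketched in Python with exact rational arithmetic. Note that the element in row $i$ and column $j$ of the inverse is $A_{ji}/a$, with the subscripts of the cofactor interchanged. The function names and the sample matrix are assumptions made for illustration.

```python
from fractions import Fraction

def minor(m, i, j):
    """Matrix m with row i and column j struck out."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor(m, i, j):
    """Signed minor A_ij of the element a_ij."""
    return (-1) ** (i + j) * det(minor(m, i, j))

def inverse(m):
    """Inverse matrix (17-4): element (i, j) is A_ji / a."""
    n = len(m)
    a = det(m)
    if a == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[Fraction(cofactor(m, j, i), a) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product (16-3): element (i, k) is a_ij b_jk summed over j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

# Sample nonsingular matrix of order 3 (its determinant is -1).
A = [[1, 0, 2],
     [2, -1, 3],
     [0, 5, 6]]
A_inv = inverse(A)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
right_product = matmul(A, A_inv)   # A A^{-1}
left_product = matmul(A_inv, A)    # A^{-1} A
```

Both products reduce to the identity matrix, and the determinant of `A_inv` is the reciprocal of the determinant of `A`.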
Since the determinant of the product of two matrices is equal to the
product of their determinants, we conclude from (17-6) that
$$|AA^{-1}| = |A|\cdot|A^{-1}| = |I| = 1,$$
so that
$$|A^{-1}| = \frac{1}{|A|}. \tag{17-7}$$
Multiplying (17-6) on the left by $A^{-1}$ and on the right by $(A^{-1})^{-1}$ gives
$$A^{-1}A = I = AA^{-1}. \tag{17-8}$$
In addition to the inverse matrix $A^{-1}$ we shall make frequent use of the
matrix
$$A' = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1}\\ a_{12} & a_{22} & \cdots & a_{n2}\\ \vdots & \vdots & & \vdots\\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{pmatrix} \tag{17-9}$$
obtained by interchanging the rows and columns in the matrix
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}. \tag{17-10}$$
The matrix $A'$ is called the transpose of $A$.
¹ See (15-5), but note the relation of the subscripts on the $A_{ij}$ to the rows and columns
in (17-4).
On using the laws of addition and multiplication of matrices it is easy
to show that
$$(A + B)' = A' + B', \qquad (kA)' = kA', \qquad (AB)' = B'A'. \tag{17-11}$$
(Note order.)
If we recall the relation (17-8),
$$A^{-1}A = AA^{-1},$$
and form the transpose
$$(A^{-1}A)' = (AA^{-1})',$$
we get, on making use of (17-11),
$$A'(A^{-1})' = (A^{-1})'A'. \tag{17-12}$$
Multiplying both members of (17-12) on the left by $(A')^{-1}$, we get
$$(A')^{-1}A'(A^{-1})' = (A')^{-1}(A^{-1})'A'.$$
Hence
$$(A^{-1})' = (A')^{-1}(AA^{-1})' = (A')^{-1}.$$
Thus
$$(A^{-1})' = (A')^{-1}. \tag{17-13}$$
The important result embodied in (17-13) states that the inverse of the
transpose of the matrix $A$ is equal to the transpose of its inverse.
In many calculations it is necessary to compute the inverse of the
product of two nonsingular matrices $A$ and $B$. We can obtain the desired
result as follows: Since
$$(AB)(AB)^{-1} = I,$$
or (see Prob. 4, Sec. 16)
$$AB(AB)^{-1} = I,$$
we get, on multiplying both members of this relation on the left by $A^{-1}$,
$$A^{-1}AB(AB)^{-1} = A^{-1}$$
or
$$B(AB)^{-1} = A^{-1}.$$
Multiplying this result on the left by $B^{-1}$, we get the desired result
$$(AB)^{-1} = B^{-1}A^{-1}. \tag{17-14}$$
(Note order.) This result can be extended in an obvious way to more than
two matrices, so that, for example,
$$(ABC)^{-1} = C^{-1}B^{-1}A^{-1}.$$
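The reversal of order in the transpose and the inverse of a product is readily confirmed for a particular pair of matrices. The following sketch is restricted to matrices of order 2, for which the inverse can be written down at once; the function names and the sample matrices are illustrative assumptions.

```python
from fractions import Fraction

def matmul(a, b):
    """Product (16-3): element (i, k) is a_ij b_jk summed over j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Interchange the rows and columns of m."""
    return [list(col) for col in zip(*m)]

def inverse2(m):
    """Inverse of a nonsingular matrix of order 2 by cofactors."""
    (p, q), (r, s) = m
    a = p * s - q * r
    if a == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[Fraction(s, a), Fraction(-q, a)],
            [Fraction(-r, a), Fraction(p, a)]]

# Hypothetical nonsingular matrices.
A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [5, 2]]

# (AB)' = B'A'
transpose_rule = transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))

# (AB)^{-1} = B^{-1} A^{-1}; the opposite order gives a different matrix.
inverse_rule = inverse2(matmul(A, B)) == matmul(inverse2(B), inverse2(A))
wrong_order = inverse2(matmul(A, B)) == matmul(inverse2(A), inverse2(B))

# (A^{-1})' = (A')^{-1}
inverse_transpose_rule = transpose(inverse2(A)) == inverse2(transpose(A))
```

For these matrices the three rules hold, while the product of the inverses taken in the original order fails to reproduce $(AB)^{-1}$.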
Example 1. Compute A~ l for the matrix
-a ->
A n m ~1, A 12 * —3; An * —2, A 22 ~ 1.
Since a *» I A 1
We note tlmt | A 1 |= s ~'H®'1/bt|.
Example 2. If $A$ is a nonsingular matrix, show that the matric equations
$$AX = I \qquad\text{and}\qquad XA = I$$
have unique solutions $X = A^{-1}$.
On multiplying both members of the given equations by $A^{-1}$, we get
$$A^{-1}AX = A^{-1}I \qquad\text{and}\qquad XAA^{-1} = IA^{-1}.$$
But $A^{-1}A = AA^{-1} = I$ and $A^{-1}I = IA^{-1} = A^{-1}$.
If we have two successive linear transformations
$$y_i = a_{ij}x_j, \qquad z_i = b_{ij}y_j, \tag{17-15}$$
the direct transformation from the variables $x_i$ to the $z_i$ is obtained by
inserting for the $y_j$ in the second set of Eqs. (17-15) from the first set. We
thus get
$$z_i = b_{ij}a_{jk}x_k. \tag{17-16}$$
The transformation (17-16) is called the product of the transformations in
(17-15). If the variables $(x_1, x_2, \ldots, x_n)$, $(y_1, y_2, \ldots, y_n)$, and $(z_1, z_2, \ldots, z_n)$
are interpreted as components of the vectors $\mathbf{x}$, $\mathbf{y}$, $\mathbf{z}$, represented by column
matrices
$$X = \begin{pmatrix} x_1\\ \vdots\\ x_n \end{pmatrix}, \qquad Y = \begin{pmatrix} y_1\\ \vdots\\ y_n \end{pmatrix}, \qquad Z = \begin{pmatrix} z_1\\ \vdots\\ z_n \end{pmatrix},$$
we can write Eqs. (17-15) as
$$Y = AX, \qquad Z = BY, \tag{17-17}$$
and the product transformation (17-16) as
$$Z = BAX. \tag{17-18}$$
Thus, when the variables $x_i$ are subjected to a linear transformation (17-15)
with a matrix $A$ and the variables $y_i$ are subjected to a linear transformation
with a matrix $B$, the product transformation has the matrix $BA$. Since $BA$
in general is not equal to $AB$, the order in which the transformations are
performed is material.
When it is desired to interpret Eqs. (17-15) as transformations on the
components of the vectors $\mathbf{x}$, $\mathbf{y}$, $\mathbf{z}$, Eqs. (17-17) and (17-18) can be written
in the forms
$$\mathbf{y} = A\mathbf{x}, \qquad \mathbf{z} = B\mathbf{y}, \qquad \mathbf{z} = BA\mathbf{x}, \tag{17-19}$$
where $\mathbf{x}$, $\mathbf{y}$, $\mathbf{z}$ are regarded as the column matrices $X$, $Y$, $Z$, respectively.
The matrices in Eqs. (17-19) can be viewed as operators transforming a
given vector into another vector. Since
$$A(k\mathbf{x}) = kA\mathbf{x}, \qquad k = \text{const},$$
and
$$A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y},$$
one often speaks of $A$ as a linear operator.
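The statement that the product transformation has the matrix $BA$, and not $AB$, can be checked by applying two transformations in succession to a sample vector. The matrices, the vector, and the helper names in this sketch are hypothetical.

```python
def matmul(a, b):
    """Product (16-3): element (i, k) is a_ij b_jk summed over j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    """Apply the linear transformation y_i = m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

# Hypothetical matrices of two successive transformations and a sample vector.
A = [[1, 2],
     [0, 1]]
B = [[0, 1],
     [1, 1]]
x = [3, -1]

y = matvec(A, x)                        # first transformation, y = Ax
z = matvec(B, y)                        # second transformation, z = By
z_direct = matvec(matmul(B, A), x)      # product transformation, z = BAx
z_wrong_order = matvec(matmul(A, B), x)

# Linearity of the operator A: A(kx) = k Ax.
k = 5
linear = matvec(A, [k * xi for xi in x]) == [k * yi for yi in y]
```

The vector `z` agrees with `z_direct` but not with `z_wrong_order`, which shows that the order of the factors is material.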
PROBLEMS
find $A^{-1}$ and $B^{-1}$. Verify that $(AB)' = B'A'$ and $(AB)^{-1} = B^{-1}A^{-1}$.
2. Prove that $(ABC)' = C'B'A'$.
3. Prove that $(A^{-1})^{-1} = A$.
4. Prove that if $A$ is singular, there exists no matrix $B$ such that $AB = I$.
5. If $y_1 = x_1\cos\alpha - x_2\sin\alpha$, $y_2 = x_1\sin\alpha + x_2\cos\alpha$, find $A^{-1}$, $A'$ and show that
$A^{-1} = A'$. If $\mathbf{x}$ is a vector with components $(x_1, x_2)$, what is the geometric relation of
$\mathbf{x}$ to $\mathbf{y}$? Write out the inverse transformation $\mathbf{x} = A^{-1}\mathbf{y}$.
6. If $y_1 = x_1 - x_2$, $y_2 = x_1 + x_2$, what is $A^{-1}$? Is $A^{-1} = A'$? If $\mathbf{x}$ is a vector with
components $(x_1, x_2)$, what is the geometric relation of $\mathbf{x}$ to $\mathbf{y}$?
7. If
i , i
Vl “^ Xl + ^ Xi ’
yt «•
compute the matrix $A^{-1}$ for the inverse transformation and compare it with the given
matrix $A$. If $\mathbf{x}$ is a vector with components $(x_1, x_2, x_3)$ and $\mathbf{y}$ is a vector with components
$(y_1, y_2, y_3)$, what is the geometric relation of $\mathbf{x}$ to $\mathbf{y}$?
8. Let
and consider the vector
a -G a)
$$\mathbf{y} = A\mathbf{x}.$$
Compute $\mathbf{x} = A^{-1}\mathbf{y}$. Is it true that $A' = A^{-1}$?
9. If
$$y_1 = x_1\cos\alpha + x_2\sin\alpha, \qquad y_2 = -x_1\sin\alpha + x_2\cos\alpha,$$
and
$$z_1 = y_1\cos\theta + y_2\sin\theta, \qquad z_2 = -y_1\sin\theta + y_2\cos\theta,$$
find the product transformation directly and also by computing the product of the
matrices as in (17-18). Compute $BA$ and $AB$, $A^{-1}B^{-1}$, and $(BA)^{-1}$. Also find $(BA)'$
and compare it with $(BA)^{-1}$.
10. If
$$y_1 = 2x_1 + x_2, \qquad y_2 = x_1 - x_2,$$
and
$$z_1 = y_1 - y_2, \qquad z_2 = 2y_1 + y_2,$$
perform the calculations required in Prob. 9.
18. Transformation of Base Vectors. In the preceding section we
interpreted the set of linear relations
$$y_i = a_{ij}x_j \tag{18-1}$$
as transformations of components $(x_1, x_2, \ldots, x_n)$ of a vector $\mathbf{x}$ into
components $(y_1, y_2, \ldots, y_n)$ of another vector $\mathbf{y}$ when the vectors are referred
to the same basis $(\mathbf{a}_i)$, so that
$$\mathbf{x} = x_i\mathbf{a}_i \qquad\text{and}\qquad \mathbf{y} = y_i\mathbf{a}_i. \tag{18-2}$$
If we introduce a new system of base vectors $\boldsymbol{\alpha}_i$, obtained from the set
$\mathbf{a}_i$ by a linear transformation
$$\boldsymbol{\alpha}_i = b_{ij}\mathbf{a}_j, \tag{18-3}$$
the vectors $\mathbf{x}$ and $\mathbf{y}$ in the new reference system will have certain
representations
$$\mathbf{x} = \xi_i\boldsymbol{\alpha}_i, \qquad \mathbf{y} = \eta_i\boldsymbol{\alpha}_i. \tag{18-4}$$
We raise two questions: (1) What is the relation of the components of
vectors in the two representations (18-2) and (18-4) when the base vectors
are transformed by (18-3)? (2) What is the form of the transformation of
the components $\xi_i$ into $\eta_i$ which corresponds to the deformation of the
vector $\mathbf{x}$ characterized by Eqs. (18-1)?
To answer the first question we insert from (18-3) in (18-4) and get
$$\mathbf{x} = \xi_ib_{ij}\mathbf{a}_j, \tag{18-5}$$
while a reference to (18-2) shows that
$$\mathbf{x} = \mathbf{a}_ix_i = \mathbf{a}_jx_j. \tag{18-6}$$
From (18-5) and (18-6) we conclude that
$$x_j = b_{ij}\xi_i. \tag{18-7}$$
This formula is the desired relationship connecting the components of $\mathbf{x}$
when it is referred to two different base systems related by (18-3).
We note that in the transformation (18-3) the summation is on the
second index, while in (18-7) it is on the first index. In other words, the
matrix of coefficients $b_{ij}$ in (18-7) is the transpose of the matrix $(b_{ij})$ in
(18-3).
If we write the matrix in (18-3) as
$$(b_{ij}) = B,$$
the set of equations (18-7) can be written as
$$\mathbf{x} = B'\boldsymbol{\xi}, \tag{18-8}$$
$\mathbf{x}$ and $\boldsymbol{\xi}$ being the column matrices with components $(x_1, x_2, \ldots, x_n)$ and
$(\xi_1, \xi_2, \ldots, \xi_n)$.
On multiplying (18-8) by $(B')^{-1}$ on the left we get the solution for $\boldsymbol{\xi}$
in the form
$$\boldsymbol{\xi} = (B')^{-1}\mathbf{x}. \tag{18-9}$$
Formulas (18-8) and (18-9) give a complete answer to the first question.
The relationship connecting the components $(y_1, y_2, \ldots, y_n)$ with
$(\eta_1, \eta_2, \ldots, \eta_n)$ can be represented similarly by
$$\mathbf{y} = B'\boldsymbol{\eta} \qquad\text{and}\qquad \boldsymbol{\eta} = (B')^{-1}\mathbf{y}. \tag{18-10}$$
We proceed next to the answer of the question concerning the form of
the deformation of space (18-1) in the new reference frame $\boldsymbol{\alpha}_i$.
We write Eqs. (18-1) in matrix form as
$$\mathbf{y} = A\mathbf{x}, \tag{18-11}$$
substitute for $\mathbf{x}$ from (18-8) and for $\mathbf{y}$ from (18-10), and obtain
$$B'\boldsymbol{\eta} = AB'\boldsymbol{\xi}. \tag{18-12}$$
To solve for $\boldsymbol{\eta}$ we multiply on the left by $(B')^{-1}$ and get
$$\boldsymbol{\eta} = (B')^{-1}AB'\boldsymbol{\xi}. \tag{18-13}$$
Thus the relationship between the components $(\xi_1, \xi_2, \ldots, \xi_n)$ and
$(\eta_1, \eta_2, \ldots, \eta_n)$ is determined by the matrix
$$S = (B')^{-1}AB'. \tag{18-14}$$
Since the matrix $S$ characterizes the same deformation of space as the
matrix $A$, the matrices $A$ and $S$ related in the manner of (18-14) are termed
similar. To avoid carrying primes, we set $B' = C$, and formula (18-14)
then assumes the form
$$S = C^{-1}AC, \tag{18-15}$$
and (18-13) becomes
$$\boldsymbol{\eta} = S\boldsymbol{\xi}. \tag{18-16}$$
One of the important problems in the theory of linear transformations is
to determine a reference frame in which the equations for the deformation
of space assume forms which admit of simple interpretations. For example,
if it proves possible to find a matrix $C$ such that the matrix $S$ in (18-15)
has the diagonal form
$$S = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0\\ 0 & \lambda_2 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & \lambda_n \end{pmatrix},$$
then Eq. (18-16) shows that
$$\eta_1 = \lambda_1\xi_1, \qquad \eta_2 = \lambda_2\xi_2, \qquad \ldots, \qquad \eta_n = \lambda_n\xi_n. \tag{18-17}$$
In three-dimensional space these correspond to simple elongations (or
contractions) of the components of the vector in the directions of base
vectors $\boldsymbol{\alpha}_i$ determined by the matrix $C = B'$ [see (18-3)].
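The passage from $A$ to the similar matrix $S = C^{-1}AC$, and the resulting relations (18-17), can be traced numerically for a small example. The sketch below uses a hypothetical symmetric matrix of order 2 and a matrix $C$ chosen in advance so that $S$ comes out diagonal; the names and numbers are assumptions for illustration only.

```python
from fractions import Fraction

def matmul(a, b):
    """Product (16-3): element (i, k) is a_ij b_jk summed over j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    """Apply the linear transformation y_i = m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def inverse2(m):
    """Inverse of a nonsingular matrix of order 2 by cofactors."""
    (p, q), (r, s) = m
    a = p * s - q * r
    if a == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[Fraction(s, a), Fraction(-q, a)],
            [Fraction(-r, a), Fraction(p, a)]]

# Hypothetical deformation matrix A and change-of-basis matrix C = B'.
A = [[2, 1],
     [1, 2]]
C = [[1, 1],
     [1, -1]]
C_inv = inverse2(C)

S = matmul(C_inv, matmul(A, C))   # S = C^{-1} A C, as in (18-15)

# Follow one vector through both descriptions of the deformation.
xi = [1, 2]                 # components in the new basis
x = matvec(C, xi)           # x = C xi, as in (18-8)
y = matvec(A, x)            # y = A x, as in (18-11)
eta = matvec(C_inv, y)      # eta = C^{-1} y, as in (18-10)
```

For this choice `S` is diagonal with elements 3 and 1, and `eta` equals `[3 * xi[0], 1 * xi[1]]`, in agreement with (18-17).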
Whether or not a matrix $C$ reducing $A$ to the diagonal form $S$ can be
found clearly depends on the nature of deformation specified by $A$. In
many problems in dynamics and in the theory of elasticity, the deformation
matrix $A$ will turn out to be symmetric, and we shall see in Sec. 20
that such matrices can always be diagonalized by finding a suitable matrix
$C$. This fact turns out to be of cardinal importance because it enormously
simplifies the analysis of many problems.
In the following section we shall study properties of the matrix $A$ in
(18-1) for those transformations that leave the length of every vector $\mathbf{x}$
unchanged. In three dimensions such transformations represent rotations
and reflections.
19. Orthogonal Transformations. Let us refer our n-dimensional space
to a set of orthonormal base vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$, introduced in Sec. 13.
Relative to this basis the vector $\mathbf{x}$ has the representation
$$\mathbf{x} = \mathbf{e}_ix_i$$
and its length $|\mathbf{x}|$ can be computed from the formula
$$|\mathbf{x}|^2 = x_ix_i. \tag{19-1}$$
Let us investigate the structure of the matrix $A$ in the class of transformations
$$y_i = a_{ij}x_j \tag{19-2}$$
which leave the length $|\mathbf{x}|$ of the vector unchanged. Now, the square
of the length of the vector $\mathbf{y}$ is
$$|\mathbf{y}|^2 = y_iy_i, \tag{19-3}$$
and since we suppose that $|\mathbf{x}| = |\mathbf{y}|$,
$$y_iy_i = x_ix_i. \tag{19-4}$$
We insert in (19-4) from (19-2) and get
$$(a_{ij}x_j)(a_{ik}x_k) = x_ix_i$$
or
$$a_{ij}a_{ik}x_jx_k = \delta_{jk}x_jx_k, \tag{19-5}$$
since $\delta_{jk}x_jx_k = x_kx_k = x_ix_i$.
On equating the coefficients of $x_jx_k$ in (19-5), we get the set of restrictive
conditions
$$a_{ij}a_{ik} = \delta_{jk} \tag{19-6}$$
on the coefficients $a_{ij}$ if the transformation (19-2) is to leave the length of
every vector unchanged.
Equations (19-6), when written out for $n = 3$, are
$$\begin{aligned}
a_{11}^2 + a_{21}^2 + a_{31}^2 &= 1,\\
a_{12}^2 + a_{22}^2 + a_{32}^2 &= 1,\\
a_{13}^2 + a_{23}^2 + a_{33}^2 &= 1,\\
a_{12}a_{13} + a_{22}a_{23} + a_{32}a_{33} &= 0,\\
a_{13}a_{11} + a_{23}a_{21} + a_{33}a_{31} &= 0,\\
a_{11}a_{12} + a_{21}a_{22} + a_{31}a_{32} &= 0.
\end{aligned}$$
The determinant of the matrix in (19-6) is
$$|a_{ij}a_{ik}| = |\delta_{jk}| = 1, \tag{19-7}$$
and if we recall the rule for multiplication of determinants [cf. (15-2)],
we conclude from (19-7) that
$$|a_{ij}a_{ik}| = |a_{ij}|\cdot|a_{ij}| = a^2 = 1, \tag{19-8}$$
where $a$ is the determinant of $(a_{ij})$. Equation (19-8) states that
$$a = \pm 1.$$
In three dimensions the situation when $a = 1$ corresponds to a rotation
of space relative to a set of fixed xyz axes determined by the unit vectors
$\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$. The circumstance when $a = -1$ corresponds to a transformation
of reflection (say, $x' = -x$, $y' = -y$, $z' = -z$) or to a reflection followed by
a rotation.
A transformation (19-2) in which the coefficients $a_{ij}$ satisfy (19-6) is
called an orthogonal transformation; it is called the transformation of rotation
if $|a_{ij}| = 1$, whatever be the dimensionality of space.
If we denote by $A'$ the transpose of $(a_{ij}) = A$ in (19-2), we can write
the orthogonality condition (19-6) in matrix form as
$$A'A = I. \tag{19-9}$$
On multiplying this by $A^{-1}$ on the right we get
$$A' = A^{-1}. \tag{19-10}$$
Thus, in an orthogonal transformation the inverse matrix $A^{-1}$ is equal to
the transpose $A'$ of $A$.
When Eqs. (19-2) are written in the form
$$\mathbf{y} = A\mathbf{x},$$
we can write their solutions for the $x_i$ as
$$\mathbf{x} = A^{-1}\mathbf{y}. \tag{19-11}$$
We conclude from (19-10) that the solutions of Eqs. (19-2), when the
transformation is orthogonal, are
$$x_i = a_{ji}y_j. \tag{19-12}$$
In Sec. 17, we saw that the matrix of the product of two linear
transformations is the product of the matrices of the component transformations.
Using this fact and the property (19-10) it is easy to show that the product
of two orthogonal transformations is an orthogonal transformation.
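The orthogonality condition (19-9) and its consequences can be tested exactly if the coefficients are rational. The sketch below uses a hypothetical rotation matrix with elements 3/5 and 4/5 and a simple reflection; the helper names are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    """Product (16-3): element (i, k) is a_ij b_jk summed over j."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    """Apply the linear transformation y_i = m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def transpose(m):
    """Interchange the rows and columns of m."""
    return [list(col) for col in zip(*m)]

def det2(m):
    """Determinant of a matrix of order 2."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_orthogonal(m):
    """Check the condition A'A = I of (19-9)."""
    n = len(m)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(transpose(m), m) == identity

# Hypothetical orthogonal matrices: a rotation and a reflection.
rotation = [[Fraction(3, 5), Fraction(-4, 5)],
            [Fraction(4, 5), Fraction(3, 5)]]
reflection = [[1, 0],
              [0, -1]]

x = [2, 1]
y = matvec(rotation, x)
length_sq_x = sum(c * c for c in x)
length_sq_y = sum(c * c for c in y)

# Inverse transformation (19-12): x_i = a_ji y_j.
x_back = matvec(transpose(rotation), y)

# The product of two orthogonal transformations is again orthogonal.
product = matmul(rotation, reflection)
```

The rotation has determinant 1 and the reflection determinant $-1$; both leave the squared length 5 of the sample vector unchanged.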
PROBLEMS
1. Verify that the transformations
(a)
$$y_1 = x_1\cos\alpha - x_2\sin\alpha, \qquad y_2 = x_1\sin\alpha + x_2\cos\alpha,$$
and
(b)
1 1
2/1
= VT 1 + ^ X3 '
V2
= X-2,
1 l
.V3
are orthogonal. Do they represent rotations?
2. Discuss the transformation
Vl ~ 3 V 2 il+ 3 h' l+
J/3
2 2 1
3" J+ 3" 2 “3 JS
Find the inverse transformation,
3 . Prove that the product of any number of orthogonal transformations is an orthog-
onal transformation.
4. If $A$ is a symmetric matrix (so that $A' = A$) and $S$ is an orthogonal transformation,
prove that the matrix $B = S^{-1}AS$ is symmetric. Thus, orthogonal transformations do
not destroy the symmetry of $A$.
6. J*‘t
( 1 1 0
1 2-1
o-i a
and let $C$ be an orthogonal matrix
$$C = \begin{pmatrix} c_{11} & c_{12} & c_{13}\\ c_{21} & c_{22} & c_{23}\\ c_{31} & c_{32} & c_{33} \end{pmatrix}.$$
Write out the set of equations which the $c_{ij}$ must satisfy if $C^{-1}AC = S$, where $S$ is a
diagonal matrix.
6, Is the transformation
Vi » 3xi — 3%
3/2 ~2xi +
orthogonal? Find the inverse transformation. Determine the components of $\mathbf{x}: (x_1, x_2)$
and $\mathbf{y}: (y_1, y_2)$ when the base vectors $\mathbf{e}_1$, $\mathbf{e}_2$ are rotated through 45° and 90°.
7. If $y_i = a_{ij}x_j$ is a linear transformation for the components of a complex vector
$\mathbf{x}: (x_1, x_2, \ldots, x_n)$, which preserves the length $|\mathbf{x}|$ of the vector, show that $\bar{a}_{ij}a_{ik} = \delta_{jk}$
or $\bar{A}'A = I$, where $\bar{A}$ is the conjugate matrix formed by replacing every element $a_{ij}$ of
$A$ by $\bar{a}_{ij}$. Transformations such that $\bar{A}' = A^{-1}$ are called unitary; they are of great
importance in quantum mechanics.
20. The Diagonalization of Matrices. We saw in Sec. 18 that the
determination of a nonsingular matrix $C$ such that the given matrix $A$
reduces to the diagonal form $S$ by a similitude transformation $C^{-1}AC$ is
equivalent to determining a set of base vectors relative to which the
transformation
$$y_i = a_{ij}x_j \tag{20-1}$$
assumes the form
$$\eta_1 = \lambda_1\xi_1, \qquad \eta_2 = \lambda_2\xi_2, \qquad \ldots, \qquad \eta_n = \lambda_n\xi_n. \tag{20-2}$$
We thus seek a solution of the matric equation
$$C^{-1}AC = S \tag{20-3}$$
in which $A = (a_{ij})$ is a given matrix, $C$ the unknown matrix
$$C = \begin{pmatrix}
c_{11} & c_{12} & \cdots & c_{1k} & \cdots & c_{1n}\\
c_{21} & c_{22} & \cdots & c_{2k} & \cdots & c_{2n}\\
\vdots & \vdots & & \vdots & & \vdots\\
c_{n1} & c_{n2} & \cdots & c_{nk} & \cdots & c_{nn}
\end{pmatrix}, \tag{20-4}$$
and $S$ is the diagonal matrix
$$S = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0\\ 0 & \lambda_2 & \cdots & 0\\ \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}. \tag{20-5}$$
On multiplying (20-3) on the left by $C$ we get an equivalent matric equation
$$AC = CS, \tag{20-6}$$
provided that the solution of (20-6) yields a nonsingular matrix $C$.
Now the matric equation (20-6) is equivalent to a system of linear
equations
$$a_{ij}c_{jk} = c_{ik}\lambda_k, \qquad \text{no sum on } k, \quad k = 1, \ldots, n, \tag{20-7}$$
obtained by equating the corresponding elements in the products $AC$ and
$CS$.
For every fixed value of $k$, the system (20-7) represents a set of $n$ linear
homogeneous equations for the unknowns $(c_{1k}, c_{2k}, \ldots, c_{nk})$ appearing in the
$k$th column of (20-4). The fact that the system (20-7) is homogeneous
can be made plainer by rewriting it in the form
$$(a_{ij} - \lambda_k\delta_{ij})c_{jk} = 0, \qquad \text{no sum on } k. \tag{20-8}$$
We recall¹ that a system of homogeneous equations has solutions other
than the obvious solution $c_{1k} = c_{2k} = \cdots = c_{nk} = 0$ if, and only if, its
determinant²
$$|a_{ij} - \lambda\delta_{ij}| = 0. \tag{20-9}$$
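On writing out this determinant in full,
$$\begin{vmatrix}
a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda
\end{vmatrix} = 0, \tag{20-10}$$
we see that (20-10) is an algebraic equation of degree $n$ in $\lambda$. Accordingly,
there are $n$ roots of this equation, say $\lambda = \lambda_1$, $\lambda = \lambda_2$, ..., $\lambda = \lambda_n$, and
corresponding to each root $\lambda = \lambda_k$ $(k = 1, 2, \ldots, n)$ the system (20-8) will
have a solution
$$(c_{1k}, c_{2k}, \ldots, c_{nk}). \tag{20-11}$$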
The solution (20-11) yields the $k$th column of the matrix $C$. If the roots
$\lambda_1, \lambda_2, \ldots, \lambda_n$ are all distinct, one can prove that the matrix $C$ will be
nonsingular.³ When the roots $\lambda_k$ are not distinct, it is impossible, in general,
to reduce $A$ by the similitude transformation (20-3) to the diagonal form,
because the desired nonsingular matrix $C$ may not exist. In important
special cases, however (for example, when $A$ is a real and symmetric
matrix), one can construct $C$ such that $C^{-1}AC$ has the diagonal form even when
some, or even all, roots are equal.
A brief discussion of this is contained in the following section.
As a matter of terminology, Eq. (20-9) is called the characteristic equation
and its solutions are characteristic values of the matrix $(a_{ij})$. The solutions
(20-11) of the system (20-8) corresponding to these characteristic values
are called characteristic vectors.⁴
¹ Appendix A, Sec. 2.
² Note that this determinantal equation when written in matrix form is $|A - \lambda I| = 0$.
³ Or in the language of vectors, if we regard each column of $C$ as a vector $\mathbf{c}^{(k)}$:
$(c_{1k}, c_{2k}, \ldots, c_{nk})$, the vectors $\mathbf{c}^{(k)}$ $(k = 1, 2, \ldots, n)$ will be linearly independent. A simple
proof of this is given in I. S. Sokolnikoff, "Tensor Analysis," pp. 33-34, John Wiley &
Sons, Inc., New York, 1951.
⁴ The hybrid terms eigenvalues for the $\lambda_k$ and eigenvectors for the $\mathbf{c}^{(k)}$: $(c_{1k}, \ldots, c_{nk})$
are used by some writers who do not mind mixing German with English.
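Example 1. Reduce the matrix
$$A = (a_{ij}) = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$$
to the diagonal form $S$ by the similitude transformation $C^{-1}AC$.
The characteristic equation (20-9) here is
$$\begin{vmatrix} 1 - \lambda & -1 \\ -1 & 1 - \lambda \end{vmatrix} = \lambda^2 - 2\lambda = 0. \tag{20-12}$$
Its solutions are $\lambda_1 = 0$, $\lambda_2 = 2$. The desired matrix $C$ in our case has the form
$$C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}, \tag{20-13}$$
the columns in which satisfy the system of equations (20-8), yielding
$$\begin{aligned}
(a_{11} - \lambda_k)c_{1k} + a_{12}c_{2k} &= 0, \\
a_{21}c_{1k} + (a_{22} - \lambda_k)c_{2k} &= 0,
\end{aligned} \qquad \text{no sum on } k, \quad k = 1, 2. \tag{20-14}$$
Since $a_{11} = 1$, $a_{12} = -1$, $a_{21} = -1$, $a_{22} = 1$, we get, on setting $k = 1$ and $\lambda_1 = 0$,
$$\begin{aligned}
c_{11} - c_{21} &= 0, \\
-c_{11} + c_{21} &= 0.
\end{aligned} \tag{20-15}$$
As is always the case with nontrivial homogeneous systems of equations,¹ there are
infinitely many solutions of the system (20-15). If we set $c_{11} = a$ (any constant), Eqs.
(20-15) give $c_{21} = a$.
Thus the vector $\mathbf{c}^{(1)}$: $(c_{11}, c_{21})$ appearing in the first column of (20-13) has the
components $c_{11} = c_{21} = a$. Since any matrix $C$ accomplishing the reduction will do, we can
take² $a = 1$.
The substitution of $k = 2$ and $\lambda_2 = 2$ in (20-14) yields the system
$$\begin{aligned}
(1 - 2)c_{12} - c_{22} &= 0, \\
-c_{12} + (1 - 2)c_{22} &= 0,
\end{aligned}$$
or
$$-c_{12} - c_{22} = 0.$$
Again there are infinitely many solutions, and if we take $c_{12} = a$, then $c_{22} = -a$. We
can set $a = 1$ if we wish, so that the elements of the second column in (20-13) are $c_{12} = 1$,
$c_{22} = -1$. The desired matrix $C$, therefore, is
$$C = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$
The inverse of $C$ is easily found to be
$$C^{-1} = \begin{pmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 \end{pmatrix},$$
so that $C^{-1}AC$ is
$$\begin{pmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 \end{pmatrix}
\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$
¹ See Appendix A, Sec. 2.
² Usually one normalizes solutions so that the length of the column vector $\mathbf{c}^{(k)}$ is 1.
This would correspond to the choice of $a = 1/\sqrt{2}$, since $c_{11}^2 + c_{21}^2 = 1$.
On multiplying these matrices we get
$$C^{-1}AC = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}, \tag{20-16}$$
as we should, since
$$S = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix},$$
as we knew from the start [see (20-5)].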
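The arithmetic of Example 1 can be checked mechanically. The following Python sketch is an illustration added for the reader and is not part of the original development; the helper names `matmul` and `inverse_2x2` are our own, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]


def inverse_2x2(M):
    """Inverse of a nonsingular 2x2 matrix, computed with exact fractions."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1, -1], [-1, 1]]   # the matrix of Example 1
C = [[1, 1], [1, -1]]    # columns: characteristic vectors for the roots 0 and 2

C_inv = inverse_2x2(C)
S = matmul(matmul(C_inv, A), C)   # should be diagonal with entries 0 and 2
print(S)
```

The product comes out diagonal, with the characteristic values 0 and 2 on the main diagonal, in agreement with (20-16).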
If we interpret A as a matrix operator characterizing the deformation of a vector x
into a vector y [see (18-11)], the result (20-16) states that in a suitable reference frame
the components of $\mathbf{x}$ and $\mathbf{y}$ are related by
$$\eta_1 = 0 \cdot \xi_1, \qquad \eta_2 = 2\xi_2.$$
We thus have a deformation of space corresponding to the twofold elongation in the
direction of one of the base vectors. In the notation of Sec. 18, C * so that one can
actually write out Eqs. (18-3) for the transformation of the base vectors. This, how-
ever, is seldom required because the essential matter is to determine the deformation
characterized by A rather than a reference frame giving a simple form of the deformation.
Example 2. Determine the characteristic values of the matrix
$$(a_{ij}) = \begin{pmatrix} 1 & -1 & -1 \\ -1 & 1 & -1 \\ -1 & -1 & 1 \end{pmatrix}. \tag{20-17}$$
The characteristic equation this time is
$$\begin{vmatrix} 1 - \lambda & -1 & -1 \\ -1 & 1 - \lambda & -1 \\ -1 & -1 & 1 - \lambda \end{vmatrix}
= (1 - \lambda)^3 - 3(1 - \lambda) - 2 = 0.$$
We easily check that the solutions of this cubic are $\lambda_1 = 2$, $\lambda_2 = 2$, $\lambda_3 = -1$. Since we
have a double root $\lambda_1 = \lambda_2 = 2$, the solution of the system (20-8) will enable us to
determine only two linearly independent columns of the matrix $C$. The matrix (20-17),
however, is real and symmetric, and one can, in fact, construct the third column of the
matrix $C$ such that
$$C^{-1}AC = S,$$
where
$$S = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}
= \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$
However, the theory presented in this section does not explain how this can be
accomplished.
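Although the theory of this section does not show how to complete $C$, the claim of Example 2 can at least be verified numerically. The sketch below assumes three orthonormal characteristic vectors found by inspection (two mutually orthogonal vectors belonging to the double root 2, and one belonging to the root −1); this particular choice, and the helper names, are ours rather than the text's. Because the columns are orthonormal, the transpose of $C$ serves as its inverse.

```python
import math


def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]


def transpose(M):
    return [list(row) for row in zip(*M)]


def char_poly(lam):
    """Left side of the cubic (1 - lam)^3 - 3(1 - lam) - 2 = 0."""
    d = 1 - lam
    return d ** 3 - 3 * d - 2


A = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]   # matrix (20-17)

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
# Assumed orthonormal characteristic vectors (found by inspection):
columns = [
    [1 / s2, -1 / s2, 0.0],        # belongs to the root 2
    [1 / s6, 1 / s6, -2 / s6],     # belongs to the root 2, orthogonal to the first
    [1 / s3, 1 / s3, 1 / s3],      # belongs to the root -1
]
C = transpose(columns)             # the vectors become the columns of C

identity_check = matmul(transpose(C), C)   # should be the unit matrix
S = matmul(matmul(transpose(C), A), C)     # should be diag(2, 2, -1)
```

Up to rounding, `S` is the diagonal matrix with entries 2, 2, −1, and `char_poly` vanishes at 2 and at −1.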
PROBLEMS
1. Diagonalize the matrix
and determine, in the manner of Example 1, the matrix C. Discuss the meaning of A
when viewed as an operator characterizing a deformation of space.
2 . Find the roots of the characteristic equation for the matrix
3. Find a matrix C reducing
(
4-{o-
\ 0
to the diagonal form by the transformation $C^{-1}AC$.
4 . Diagonalize the matrix
2 4
4 2
—6 —6
5. Prove that the roots of characteristic equations of all similar matrices are equal.
Hint: Write the characteristic equation of $C^{-1}AC$ [cf. (20-9)] in the form $|C^{-1}AC - \lambda I| = 0$.
But $|C^{-1}AC - \lambda I| = |C^{-1}(A - \lambda I)C| = |A - \lambda I|$, since $|C^{-1}| = 1/|C|$.
21. Real Symmetric Matrices and Quadratic Forms. Let the matrix
$A = (a_{ij})$ in a linear transformation
$$y_i = a_{ij}x_j, \qquad i, j = 1, \ldots, n, \tag{21-1}$$
be real and symmetric, so that $A' = A$ (or $a_{ij} = a_{ji}$). We shall indicate
that in this case the matrix $A$ can always be reduced by the transformation
$C^{-1}AC$ to the diagonal form $S$. Moreover, $C$ can be chosen as an orthogonal
matrix; that is, a matrix such that $C^{-1} = C'$ [cf. Eq. (19-10)].
Linear transformations with real symmetric matrices dominate the
study of deformations of elastic media. Real symmetric matrices also
occur in the study of quadratic forms
$$Q(x_1, x_2, \ldots, x_n) \equiv a_{ij}x_ix_j, \qquad i, j = 1, 2, \ldots, n, \tag{21-2}$$
which arise in many problems concerned with vibrations of dynamical
systems.
We can always suppose that the coefficients in a quadratic form (21-2)
are symmetric because every quadratic form $Q$ can be symmetrized by
writing it as
$$Q = \tfrac12(a_{ij} + a_{ji})x_ix_j = b_{ij}x_ix_j,$$
in which the coefficients
$$b_{ij} = \tfrac12(a_{ij} + a_{ji})$$
are obviously symmetric. Henceforth we shall suppose that our quadratic
forms have been symmetrized so that $a_{ij} = a_{ji}$.
It will follow from the discussion in this section that the problems of
reduction of the transformation (21-1) with symmetric coefficients to the form
$$\eta_1 = \lambda_1\xi_1, \quad \eta_2 = \lambda_2\xi_2, \quad \ldots, \quad \eta_n = \lambda_n\xi_n \tag{21-3}$$
and of the quadratic form (21-2) to the form
$$Q = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_n\xi_n^2 \tag{21-4}$$
are mathematically identical.
We first note several properties of quadratic forms. If the variables $x_i$
in (21-2) are subjected to a linear transformation
$$x_i = c_{ij}\xi_j, \tag{21-5}$$
the form (21-2) becomes
$$Q = a_{ij}(c_{ik}\xi_k)(c_{jr}\xi_r) = a_{ij}c_{ik}c_{jr}\xi_k\xi_r.$$
We denote the coefficients of $\xi_k\xi_r$ by $b_{kr}$, so that
$$Q = b_{kr}\xi_k\xi_r, \tag{21-6}$$
where
$$b_{kr} = a_{ij}c_{ik}c_{jr}. \tag{21-7}$$
Since $i$ and $j$ are the summation indices and $a_{ij} = a_{ji}$, we see that the value
of $b_{kr}$ is not changed by an interchange of $k$ and $r$. Thus, we conclude
that the symmetry of the coefficients in a quadratic form (21-2) is not
destroyed when the variables $x_i$ are changed by a linear transformation
(21-5).
If we write (21-7) in the form
$$b_{kr} = c_{ik}(a_{ij}c_{jr}),$$
we see that the sum $a_{ij}c_{jr}$ is an element in the $i$th row and the $r$th column
of the matrix
$$AC \equiv D,$$
or $(a_{ij}c_{jr}) = (d_{ir})$.
The product $c_{ik}(a_{ij}c_{jr}) = c_{ik}d_{ir}$ is the element in the $k$th row and the $r$th
column of the matrix $C'D$. Thus we can write (21-7) as
$$B = C'AC. \tag{21-8}$$
The result (21-8) can be stated as a theorem.
Theorem. When the variables $x_i$ in a quadratic form (21-2) with a matrix
$A$ are subjected to a linear transformation (21-5) with a matrix $C$, the resulting
quadratic form has the matrix $C'AC$.
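A small numerical check may make the theorem concrete. The sketch below is ours: the symmetric matrix `A`, the transformation matrix `C`, and the trial values `xi` are arbitrary illustrative choices, and the helper names are not from the text. It evaluates the form once in the old variables and once in the new variables with the matrix $C'AC$.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]


def transpose(M):
    return [list(row) for row in zip(*M)]


def quadratic_form(M, v):
    """Value of the double sum m_ij v_i v_j."""
    n = len(v)
    return sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


A = [[2, 1], [1, 3]]     # symmetric coefficients a_ij (illustrative)
C = [[1, 2], [0, 1]]     # matrix of the transformation x_i = c_ij xi_j (illustrative)
xi = [3, -1]             # trial values of the new variables

x = [sum(C[i][j] * xi[j] for j in range(2)) for i in range(2)]   # old variables
B = matmul(matmul(transpose(C), A), C)                            # B = C'AC

q_old = quadratic_form(A, x)     # form evaluated in the x variables
q_new = quadratic_form(B, xi)    # form evaluated in the xi variables
```

Both evaluations give the same number, and `B` is again symmetric, as the discussion following (21-7) predicts.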
If the linear transformation (21-5) is orthogonal, then $C' = C^{-1}$, and
hence (21-8) can be written as
$$B = C^{-1}AC. \tag{21-9}$$
We conclude from (21-9) that the reduction of a symmetric matrix to the
diagonal form by an orthogonal transformation calls for a solution of the
matric equation
$$S = C^{-1}AC. \tag{21-10}$$
This equation is identical with that considered in the preceding section.
When the roots of the characteristic equation
$$|a_{ij} - \lambda\delta_{ij}| = 0 \tag{21-11}$$
are distinct and real, the method of Sec. 20 enables us to compute a matrix
$C$ which can be shown to be orthogonal. As a matter of fact, the desired
matrix $C$ can always be found whenever the matrix $A$ is real and symmetric.
Moreover, it can be shown that the roots of symmetric real matrices are
invariably real.¹
The fact that the columns of $C$ are linearly independent can be established easily when
$A$ is symmetric and the $\lambda_k$ are all unequal. Let $\mathbf{c}$ be a characteristic vector for $\lambda$ and
let $\mathbf{c}'$ be a characteristic vector for a different value, $\lambda'$. Then, from (20-7),
$$a_{ij}c_j = \lambda c_i \qquad \text{and} \qquad a_{ij}c'_j = \lambda' c'_i.$$
Multiplying these equations by $c'_i$ and $c_i$, respectively, gives
$$a_{ij}c'_ic_j = \lambda c'_ic_i \qquad \text{and} \qquad a_{ij}c_ic'_j = \lambda' c_ic'_i$$
after summing on $i$. Since $a_{ij} = a_{ji}$, the left sides of these equations are equal. Hence
by subtraction,
$$0 = \lambda c_ic'_i - \lambda' c_ic'_i = (\lambda - \lambda')c_ic'_i.$$
Since $\lambda \neq \lambda'$ we get $\mathbf{c} \cdot \mathbf{c}' \equiv c_ic'_i = 0$, so that the vectors $\mathbf{c}$ and $\mathbf{c}'$ are orthogonal and thus
linearly independent.
If the roots $\lambda_i$ are all positive, Eq. (21-4) shows that the quadratic form
(21-2) assumes positive values for all nonzero values of the variables $x_i$.
Such quadratic forms are called positive definite. They appear in numerous
investigations in mathematical physics.
An analogue of a symmetric quadratic form (21-2) in which the variables $x_i$ are
complex is a bilinear form²
$$H = a_{ij}\bar{x}_ix_j \tag{21-12}$$
in which $a_{ij} = \bar{a}_{ji}$. Such forms are called Hermitian, and their matrices $(a_{ij}) = A$ are
Hermitian matrices. Since $a_{ij} = \bar{a}_{ji}$, it follows that the elements on the main diagonal
of $A$ are necessarily real and that
$$A' = \bar{A}.$$
From the structure of (21-12) it follows that the Hermitian forms assume only real
values for arbitrary complex values of the $x_i$; for on taking the conjugate of (21-12), we get
$$\bar{H} = \bar{a}_{ij}x_i\bar{x}_j = a_{ji}\bar{x}_jx_i = H,$$
which proves that $H$ is real.
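The reality of a Hermitian form is easy to observe with Python's built-in complex numbers. In the sketch below the matrix and the vector are arbitrary illustrative values of our own choosing (the matrix merely satisfies the Hermitian condition), and `hermitian_form` is a hypothetical helper name.

```python
def hermitian_form(A, x):
    """Value of the double sum a_ij * conj(x_i) * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i].conjugate() * x[j]
               for i in range(n) for j in range(n))


# Illustrative Hermitian matrix: each a_ij is the conjugate of a_ji,
# so the diagonal elements are real.
A = [[2, 1 - 1j],
     [1 + 1j, 3]]
x = [1 + 2j, -1j]        # arbitrary complex values of the variables

H = hermitian_form(A, x)
```

For these values the imaginary parts of the two off-diagonal terms cancel, and `H` is the real number 7.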
¹ For proofs utilizing the notation of this section, see I. S. Sokolnikoff, "Tensor
Analysis," pp. 37-40, John Wiley & Sons, Inc., New York, 1951.
² Cf. Prob. 7, Sec. 19.
Hermitian forms occur in quantum mechanics, and a discussion of the reduction of a
quadratic form to a sum of squares (21-4) can be generalized to show that (21-12) can
be reduced to the form
$$H = \lambda_1\xi_1\bar{\xi}_1 + \lambda_2\xi_2\bar{\xi}_2 + \cdots + \lambda_n\xi_n\bar{\xi}_n$$
by a linear transformation (21-5) with a unitary matrix $C$ defined in Prob. 7 of Sec. 19.
22. Solution of Systems of Linear Equations. In Sec. 15 we derived
Cramer's rule for solving the system of equations
$$a_{ij}x_j = b_i. \tag{22-1}$$
When the number of equations in (22-1) is large, Cramer's rule is inefficient,
since it requires evaluating determinants of high orders. For this reason
all practical methods of solving the system (22-1) depend on reducing it
by some process to an equivalent system whose matrix is sufficiently
simple to enable one to compute the unknowns without great effort.
The system (22-1) can be written in matrix notation as
$$A\mathbf{x} = \mathbf{b}, \tag{22-2}$$
where $A = (a_{ij})$, $\mathbf{x}$ is the column matrix $(x_1, x_2, \ldots, x_n)$, and $\mathbf{b}$ is the column
matrix $(b_1, b_2, \ldots, b_n)$. If $A$ is nonsingular, the solution of (22-2) is
$$\mathbf{x} = A^{-1}\mathbf{b}, \tag{22-3}$$
so that the determination of unknowns hinges on constructing the inverse
matrix $A^{-1}$. The development of effective methods for inverting matrices is
a major problem of numerical analysis. One such method depends on
a reduction of the system (22-2) to an equivalent system
$$B\mathbf{x} = \mathbf{c}, \tag{22-4}$$
in which $B$ has the triangular form
$$B = \begin{pmatrix}
1 & b_{12} & b_{13} & \cdots & b_{1n} \\
0 & 1 & b_{23} & \cdots & b_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix},$$
in which the elements below the main diagonal are all zero. When the
system (22-4) is written out in full, it has the appearance of Eqs. (4-2) in
Chap. 9, whose solutions, as shown in Sec. 4, Chap. 9, can be obtained
quite readily.¹
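To indicate why a triangular system is solved "quite readily," here is a minimal back-substitution sketch in Python. It is our illustration rather than the procedure of Chap. 9; the function name `back_substitute` and the numerical values are assumptions made for the example. The last equation gives the last unknown at once, and each earlier equation then contains only one unknown not yet determined.

```python
def back_substitute(B, c):
    """Solve B x = c for an upper-triangular B with nonzero diagonal elements."""
    n = len(c)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # work upward from the last equation
        known = sum(B[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - known) / B[i][i]
    return x


# Illustrative system with unit diagonal, as in the triangular form above.
B = [[1, 2, -1],
     [0, 1, 3],
     [0, 0, 1]]
c = [2, 5, 1]

x = back_substitute(B, c)
```

For this system the unknowns come out as $x_3 = 1$, $x_2 = 2$, $x_1 = -1$, in that order of computation.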
¹ This is the so-called Gauss reduction method discussed in Chap. 9.
Among other methods for solving the system (22-2) is the method of
orthogonalization, the essence of which is as follows. Let us seek a matrix
$C$ such that the product
$$CA = D \tag{22-5}$$
is an orthogonal matrix. Since $D$ is required to be orthogonal, it follows
from (19-9) that
$$DD' = D'D = I, \tag{22-6}$$
where $D'$ is the transpose of $D$.
On multiplying (22-2) on the left by $A'C'C$, we get
$$A'C'CA\mathbf{x} = A'C'C\mathbf{b}, \tag{22-7}$$
and since
$$A'C' = (CA)' = D'$$
by virtue of (17-11) and (22-5), we can write (22-7) as
$$D'D\mathbf{x} = D'C\mathbf{b}.$$
However, by (22-6) $D'D = I$, so that we finally have
$$\mathbf{x} = D'C\mathbf{b}. \tag{22-8}$$
Formula (22-8) gives the solution of the system (22-1) once a matrix $C$ is
determined. We do not present the classical procedure for constructing
$C$ (known as the Gram-Schmidt method) because of the rather special
character of the problem.
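For the curious reader, formula (22-8) can be tried out with a short Python sketch. The construction of $C$ below is one possible Gram-Schmidt-style procedure applied to the rows of $A$; it is our own assumption about how such a matrix might be built and is not claimed to be the classical procedure in detail. The function names and the sample system are likewise illustrative.

```python
import math


def orthogonalize_rows(A):
    """Return (C, D) with C A = D and the rows of D orthonormal.

    Each row of A has its projections on the rows of D already built removed
    and is then normalized; the same operations are recorded in C.
    """
    n = len(A)
    C, D = [], []
    for i in range(n):
        d = [float(v) for v in A[i]]
        c = [1.0 if j == i else 0.0 for j in range(n)]
        for q, cq in zip(D, C):
            proj = sum(dv * qv for dv, qv in zip(d, q))
            d = [dv - proj * qv for dv, qv in zip(d, q)]
            c = [cv - proj * cqv for cv, cqv in zip(c, cq)]
        norm = math.sqrt(sum(dv * dv for dv in d))
        if norm < 1e-12:
            raise ValueError("rows are linearly dependent; A is singular")
        D.append([dv / norm for dv in d])
        C.append([cv / norm for cv in c])
    return C, D


def solve_by_orthogonalization(A, b):
    """Solve A x = b by formula (22-8): x = D' C b."""
    n = len(b)
    C, D = orthogonalize_rows(A)
    Cb = [sum(C[i][j] * b[j] for j in range(n)) for i in range(n)]
    return [sum(D[k][i] * Cb[k] for k in range(n)) for i in range(n)]


A = [[2, 1], [1, 3]]     # illustrative nonsingular matrix
b = [3, 5]
x = solve_by_orthogonalization(A, b)
```

For this sample system the result agrees, up to rounding, with the solution $x_1 = 4/5$, $x_2 = 7/5$ obtained by elimination.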
CHAPTER 5
VECTOR FIELD THEORY
Coordinates and Functions
1. Curvilinear Coordinates
2. Metric Coefficients
3. Scalar and Vector Fields. Gradient
4. Integration of Vector Functions
5. Line Integrals Independent of the Path
Transformation Theorems
6. Simply Connected Regular Regions
7. Divergence
8. The Divergence Theorem
9. Green's Theorem. Line Integral in the Plane
10. Curl of a Vector Field
11. Stokes's Theorem
Illustrations and Applications
12. Solenoidal and Irrotational Fields
13. Gradient, Divergence, and Curl in Orthogonal Curvilinear Coordinates
14. Conservative Force Fields
15. Steady Flow of Fluids
16. Equation of Heat Flow
17. Equations of Hydrodynamics
This chapter is concerned with a study of scalar and vector functions
defined in the familiar three-dimensional space. It includes a discussion
of curvilinear coordinate systems and a derivation of several transforma-
tion theorems involving line, surface, and volume integrals. These theo-
rems, usually associated with the names of Gauss, Green, and Stokes, are
indispensable in the study of mechanics of fluids, thermodynamics, and
electrodynamics and in virtually every branch of mechanics of deformable
media.
COORDINATES AND FUNCTIONS
1. Curvilinear Coordinates. The chief advantage of formulating relations
among geometrical and physical quantities in the form of vector
equations is that the relations so stated are valid in all coordinate systems.
Only when one comes to consider a special problem involving numerical
computations does it prove desirable to translate vector equations into
the language of special coordinate systems that seem best adapted to the
problem at hand. For example, in analyzing vibrations of clamped rec-
tangular membranes, it is usually advantageous to express the displace-
ment vector in cartesian coordinates. In the study of heat flow in a sphere,
the geometry of the situation suggests the use of spherical coordinates,
while problems concerned with the flow of currents in cylindrical con-
ductors may indicate the use of cylindrical or bipolar coordinates. All
these coordinate systems are but special cases of the general curvilinear
coordinate system which we proceed to describe.
Let us refer a given region $R$ of space to a set of orthogonal cartesian
axes $y_1, y_2, y_3$. We denote the coordinates of any point $P$ in $R$ by $(y_1, y_2, y_3)$
(Fig. 1) instead of the familiar labels $(x, y, z)$. A set of functional relations
$$\begin{aligned}
x_1 &= x_1(y_1, y_2, y_3), \\
x_2 &= x_2(y_1, y_2, y_3), \\
x_3 &= x_3(y_1, y_2, y_3),
\end{aligned} \tag{1-1}$$
connecting the variables $y_1, y_2, y_3$ with three new variables $x_1, x_2, x_3$ is
said to represent a transformation of coordinates. We shall suppose that
the functions $x_i(y_1, y_2, y_3)$ $(i = 1, 2, 3)$ are single-valued and are continuously
differentiable at all points of the region $R$ and that Eqs. (1-1) can
be solved for the $y_i$ to yield the inverse transformation
$$\begin{aligned}
y_1 &= y_1(x_1, x_2, x_3), \\
y_2 &= y_2(x_1, x_2, x_3), \\
y_3 &= y_3(x_1, x_2, x_3),
\end{aligned} \tag{1-2}$$
in which the functions $y_i(x_1, x_2, x_3)$ are single-valued and continuously
differentiable with respect to the variables $x_i$. The transformations (1-1)
and (1-2) with these properties establish a one-to-one correspondence
between the triplets of values $(y_1, y_2, y_3)$ and $(x_1, x_2, x_3)$. We shall term the
triplet of values $(x_1, x_2, x_3)$, corresponding to a given point $P(y_1, y_2, y_3)$,
the curvilinear coordinates of $P$, and shall say that Eqs. (1-1) define a
curvilinear coordinate system $x_1, x_2, x_3$. The reason for this terminology is
the following: If we set in (1-1) $x_1 = c_1$ (a constant), the equation
$$x_1(y_1, y_2, y_3) = c_1 \tag{1-3}$$
represents a certain surface $S_1$. Similarly, equations
$$x_2(y_1, y_2, y_3) = c_2 \tag{1-4}$$
and
$$x_3(y_1, y_2, y_3) = c_3 \tag{1-5}$$
represent surfaces $S_2$ and $S_3$. These surfaces, shown in Fig. 2, intersect
at the point $P$ whose cartesian coordinates $(y_1, y_2, y_3)$ can be obtained by
solving Eqs. (1-3) to (1-5) for the $y_i$.
The surfaces $S_i$ are called coordinate surfaces, and their intersections
pair by pair are coordinate lines $x_1, x_2, x_3$. Thus, the $x_1$ coordinate line
is the line of intersection of the surfaces $x_2 = c_2$ and $x_3 = c_3$. Along this
line the only variable that changes is $x_1$, since $x_2 = c_2$ and $x_3 = c_3$ along
the line $x_1$. Similarly, along the $x_2$ coordinate line the only variable that
changes is $x_2$, while along the $x_3$ line the only variable that changes is $x_3$.
A very special case of the set of Eqs. (1-1) is
$$x_1 = y_1, \qquad x_2 = y_2, \qquad x_3 = y_3. \tag{1-6}$$
If we set $x_i = c_i$ $(i = 1, 2, 3)$ in (1-6), we get three planes $y_i = c_i$
perpendicular to the $y$ coordinate axes. These planes intersect at the
point $(c_1, c_2, c_3)$. The coordinate surfaces in this case are planes, and
their intersections pair by pair are straight lines parallel to the
coordinate axes.
As a more interesting example consider a transformation
$$y_1 = r\cos\theta, \qquad y_2 = r\sin\theta, \qquad y_3 = z, \tag{1-7}$$
which is of the form (1-2) if we set $x_1 = r$, $x_2 = \theta$, $x_3 = z$. The inverse
of (1-7) is
$$r = +\sqrt{y_1^2 + y_2^2}, \qquad \theta = \tan^{-1}\frac{y_2}{y_1}, \qquad z = y_3, \tag{1-8}$$
and it is single-valued if we take $0 \le \theta < 2\pi$ and $r > 0$. The surface
$r = c_1$ is a circular cylinder $y_1^2 + y_2^2 = c_1^2$ whose axis coincides with the
$y_3$ axis (Fig. 3). The surface $\theta = c_2$ is the plane $y_2 = (\tan c_2)y_1$ containing
the $y_3$ axis, while the surface $z = c_3$ is the plane $y_3 = c_3$ perpendicular
to the $y_3$ axis. The $r$, $\theta$, and $z$ coordinate lines are shown in Fig. 3, and
we recognize that the curvilinear coordinate system $r$, $\theta$, $z$ is the familiar
system of cylindrical coordinates.
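The cylindrical transformation and its inverse are easily tried out numerically. The Python sketch below is our illustration; the function names are hypothetical. One departure from the printed inverse is made deliberately and should be noted: the two-argument arctangent `math.atan2` is used in place of the quotient $y_2/y_1$, so that the quadrant of the angle is selected automatically, and the result is then reduced to the interval from 0 to $2\pi$.

```python
import math


def cylindrical_to_cartesian(r, theta, z):
    """Cartesian coordinates (y1, y2, y3) of the point with cylindrical (r, theta, z)."""
    return (r * math.cos(theta), r * math.sin(theta), z)


def cartesian_to_cylindrical(y1, y2, y3):
    """Inverse transformation, with theta reduced to the range [0, 2*pi)."""
    r = math.hypot(y1, y2)
    theta = math.atan2(y2, y1) % (2 * math.pi)   # atan2 resolves the quadrant
    return (r, theta, y3)


point = (2.0, 4.0, -1.0)                 # r = 2, theta = 4 radians, z = -1
y = cylindrical_to_cartesian(*point)
back = cartesian_to_cylindrical(*y)      # should reproduce the original triplet
```

The round trip returns the original triplet to within rounding, which illustrates the one-to-one correspondence under the stated restrictions on $r$ and the angle.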
As a final example, consider the transformation
$$y_1 = \rho\sin\theta\cos\phi, \qquad y_2 = \rho\sin\theta\sin\phi, \qquad y_3 = \rho\cos\theta, \tag{1-9}$$
with the inverse
$$\rho = +\sqrt{y_1^2 + y_2^2 + y_3^2}, \qquad
\theta = \tan^{-1}\frac{\sqrt{y_1^2 + y_2^2}}{y_3}, \qquad
\phi = \tan^{-1}\frac{y_2}{y_1}, \tag{1-10}$$
which is single-valued if we suppose that $\rho > 0$, $0 < \theta < \pi$, $0 \le \phi < 2\pi$.
The transformation defines a spherical system of coordinates.
The coordinate surfaces $\rho = \text{const}$, $\theta = \text{const}$, and $\phi = \text{const}$ are,
respectively, spheres, cones, and planes, shown in Fig. 4. The
coordinate lines are the meridians, the lines of parallels, and the radial lines.
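A corresponding sketch for the spherical system follows. It is again our own illustration with hypothetical function names, and it again uses `math.atan2` rather than a one-argument arctangent so that both angles fall in the intended ranges without separate treatment of the quadrants.

```python
import math


def spherical_to_cartesian(rho, theta, phi):
    """Cartesian (y1, y2, y3) from spherical (rho, theta, phi)."""
    return (rho * math.sin(theta) * math.cos(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(theta))


def cartesian_to_spherical(y1, y2, y3):
    """Inverse transformation with 0 <= theta <= pi and 0 <= phi < 2*pi."""
    rho = math.sqrt(y1 * y1 + y2 * y2 + y3 * y3)
    theta = math.atan2(math.hypot(y1, y2), y3)
    phi = math.atan2(y2, y1) % (2 * math.pi)
    return (rho, theta, phi)


point = (3.0, 2.0, 5.0)                  # rho = 3, theta = 2, phi = 5 (radians)
y = spherical_to_cartesian(*point)
back = cartesian_to_spherical(*y)        # should reproduce the original triplet
```

Holding two of the three arguments of `spherical_to_cartesian` fixed and varying the third traces out one of the coordinate lines described above.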
PROBLEMS
1- Discuss the curvilinear coordinates determined by
V\ * 3*1 -f X2 -f *3,
V2 « Xi - *2 + * 3 ,
2/8 * 2jj + X2 - i*3.
2. Show by geometry that the coordinate lines in cylindrical and spherical coordinate
systems intersect at right angles.
2. Metric Coefficients. In this section we introduce an abridged
notation which will enable us to write many formulas compactly and without
loss of clarity. Thus, we shall write the set of three equations of
transformation (1-1) in the form
$$x_i = x_i(y_1, y_2, y_3), \qquad i = 1, 2, 3, \tag{2-1}$$
and their inverse (1-2) as
$$y_i = y_i(x_1, x_2, x_3). \tag{2-2}$$
Throughout this section we shall suppose that the Latin indices $i$, $j$, $k$
have the range of values 1, 2, 3.
If $P(y_1, y_2, y_3)$ is any point referred to a set of cartesian axes $y$ (Fig. 5),
its position vector $\mathbf{r}$ can be written in the form
$$\mathbf{r} = \mathbf{i}_1y_1 + \mathbf{i}_2y_2 + \mathbf{i}_3y_3, \tag{2-3}$$
where the $\mathbf{i}_1$, $\mathbf{i}_2$, $\mathbf{i}_3$ are the unit base vectors, which in Chap. 4 we denoted
by $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.
The square of the element of arc $ds$ along some curve $C$ has the form
$$(ds)^2 = (dy_1)^2 + (dy_2)^2 + (dy_3)^2, \tag{2-4}$$
and since
$$d\mathbf{r} = \mathbf{i}_1\,dy_1 + \mathbf{i}_2\,dy_2 + \mathbf{i}_3\,dy_3, \tag{2-5}$$
we can write (2-4) as a scalar product
$$(ds)^2 = \sum_{i=1}^{3} dy_i\,dy_i = d\mathbf{r} \cdot d\mathbf{r}. \tag{2-6}$$
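If we replace the $y_i$ in (2-3) by their values in terms of the $x$'s with the
aid of (2-2), $\mathbf{r}$ becomes a function of the variables $x_i$ and we can write
$$d\mathbf{r} = \frac{\partial \mathbf{r}}{\partial x_1}\,dx_1 + \frac{\partial \mathbf{r}}{\partial x_2}\,dx_2 + \frac{\partial \mathbf{r}}{\partial x_3}\,dx_3
= \sum_{i=1}^{3} \frac{\partial \mathbf{r}}{\partial x_i}\,dx_i. \tag{2-7}$$
Now, the symbol
$$\frac{\partial \mathbf{r}(x_1, x_2, x_3)}{\partial x_i}$$
denotes the derivative of $\mathbf{r}$ with respect to a particular variable $x_i$ $(i = 1, 2, 3)$
when the remaining variables are held fast. Thus, if we fix the
variables $x_2$ and $x_3$ by setting $x_2 = c_2$ and $x_3 = c_3$, $\mathbf{r}$ becomes a function
of $x_1$ alone, and hence the terminus of $\mathbf{r}$ is constrained to move along the
$x_1$ coordinate line in the $x$ coordinate system determined by Eqs. (2-1).
Consequently, the vector
$$\frac{\partial \mathbf{r}}{\partial x_1} = \lim_{\Delta x_1 \to 0} \frac{\Delta \mathbf{r}}{\Delta x_1}$$
is tangent to the coordinate line $x_1$. Similarly, we conclude that the
vectors $\partial \mathbf{r}/\partial x_2$ and $\partial \mathbf{r}/\partial x_3$ are tangent to the $x_2$ and $x_3$ coordinate lines,
respectively (Fig. 5). If we denote these vectors by $\mathbf{a}_i$, so that
$$\mathbf{a}_i = \frac{\partial \mathbf{r}}{\partial x_i}, \tag{2-8}$$
we can write (2-7) as
$$d\mathbf{r} = \sum_{i=1}^{3} \mathbf{a}_i\,dx_i \tag{2-9}$$
and hence Eq. (2-6) assumes the form
$$(ds)^2 = \Bigl(\sum_{i=1}^{3} \mathbf{a}_i\,dx_i\Bigr) \cdot \Bigl(\sum_{j=1}^{3} \mathbf{a}_j\,dx_j\Bigr). \tag{2-10}$$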
On expanding the scalar product in (2-10), we see that formula (2-10)
can be written as
$$(ds)^2 = \sum_{i=1}^{3}\sum_{j=1}^{3} \mathbf{a}_i \cdot \mathbf{a}_j\,dx_i\,dx_j$$
and hence, with $g_{ij}$ defined by
$$g_{ij} \equiv \mathbf{a}_i \cdot \mathbf{a}_j, \tag{2-11}$$
we can write it as
$$(ds)^2 = \sum_{i=1}^{3}\sum_{j=1}^{3} g_{ij}\,dx_i\,dx_j. \tag{2-12}$$
In expanded form this reads
$$\begin{aligned}
(ds)^2 = {} & g_{11}(dx_1)^2 + g_{12}\,dx_1\,dx_2 + g_{13}\,dx_1\,dx_3 \\
& + g_{21}\,dx_2\,dx_1 + g_{22}(dx_2)^2 + g_{23}\,dx_2\,dx_3 \\
& + g_{31}\,dx_3\,dx_1 + g_{32}\,dx_3\,dx_2 + g_{33}(dx_3)^2.
\end{aligned} \tag{2-13}$$
Since $\mathbf{a}_i \cdot \mathbf{a}_j = \mathbf{a}_j \cdot \mathbf{a}_i$, we see from the definition (2-11) that $g_{ij} = g_{ji}$. Thus
the quadratic differential form (2-13) is symmetric.
For reasons which will appear presently, the coefficients $g_{ij}$ in this
quadratic form are called metric coefficients. We shall see that they can
be computed directly from Eqs. (2-2) without first calculating the vectors
$\mathbf{a}_i$.
The vectors $\mathbf{a}_i$, which were found to be tangent to the coordinate lines
$x_i$ at a given point $P$, are called base vectors in the curvilinear coordinate
system $x$. Any vector $\mathbf{A}$ with the origin at $P$ can be resolved into
components $A_1$, $A_2$, $A_3$ along the directions of the vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ (Fig.
6). Thus, the base vectors $\mathbf{a}_i$ play the same role in the system $x$ as the
base vectors $\mathbf{i}_1$, $\mathbf{i}_2$, $\mathbf{i}_3$ do in the cartesian system $y$. It should be noted,
however, that while the magnitudes and directions of cartesian base
vectors are fixed, the vectors $\mathbf{a}_i$, in general, vary from point to point in space.
From the definition (2-11) we see on setting $i = j = 1$ that the length
of $\mathbf{a}_1$ is $|\mathbf{a}_1| = \sqrt{g_{11}}$. Similarly, $|\mathbf{a}_2| = \sqrt{g_{22}}$ and $|\mathbf{a}_3| = \sqrt{g_{33}}$. These
vectors are orthogonal if, and only if,
$$\begin{aligned}
g_{12} &= g_{21} = \mathbf{a}_1 \cdot \mathbf{a}_2 = 0, \\
g_{31} &= g_{13} = \mathbf{a}_1 \cdot \mathbf{a}_3 = 0, \\
g_{23} &= g_{32} = \mathbf{a}_2 \cdot \mathbf{a}_3 = 0.
\end{aligned}$$
A curvilinear coordinate system for which these relations hold is called
orthogonal, and we note that in an orthogonal system the quadratic form
(2-13) has the structure
$$(ds)^2 = g_{11}(dx_1)^2 + g_{22}(dx_2)^2 + g_{33}(dx_3)^2. \tag{2-14}$$
To get at the meaning of the coefficients $g_{11}$, $g_{22}$, and $g_{33}$, we note that
when an element of arc $ds$ is directed along the $x_1$ coordinate line, $dx_2 = dx_3 = 0$,
since along the $x_1$ line $x_2$ and $x_3$ do not vary. Thus, (2-14) gives
in this case
$$(ds_1)^2 = g_{11}(dx_1)^2,$$
so that
$$ds_1 = \sqrt{g_{11}}\,dx_1. \tag{2-15}$$
Thus, the length of the arc element $ds_1$ along the $x_1$ coordinate line is
obtained by multiplying the differential of $x_1$ by $\sqrt{g_{11}}$. Similarly we find
that the differentials of arc $ds_i$ along the $x_2$ and $x_3$ coordinate lines are
$$ds_2 = \sqrt{g_{22}}\,dx_2, \qquad ds_3 = \sqrt{g_{33}}\,dx_3. \tag{2-16}$$
Since the $ds_i$ and the $dx_i$ are real, we conclude that $g_{11} > 0$, $g_{22} > 0$,
$g_{33} > 0$. In orthogonal cartesian coordinates $(ds)^2$ is given by the formula
(2-4), and hence in such a system $g_{11} = g_{22} = g_{33} = 1$.
An element of volume $d\tau$ in general curvilinear coordinates is defined
as the volume of the parallelepiped
$$d\tau = |\mathbf{a}_1 \cdot \mathbf{a}_2 \times \mathbf{a}_3|\,dx_1\,dx_2\,dx_3 \tag{2-17}$$
constructed on the base vectors $\mathbf{a}_i$. If the system is orthogonal, (2-17)
reduces to
$$d\tau = \sqrt{g_{11}g_{22}g_{33}}\,dx_1\,dx_2\,dx_3, \tag{2-18}$$
as is immediately obvious from (2-15) and (2-16).
When a curvilinear coordinate system $x$ is determined by equations of
the form (2-1), we can write the inverse transformation (2-2) as
$$y_k = y_k(x_1, x_2, x_3) \tag{2-19}$$
and deduce the metric coefficients $g_{ij}$ as follows: On differentiating Eqs.
(2-19) with respect to $x_i$ we get
$$dy_k = \sum_{i=1}^{3} \frac{\partial y_k}{\partial x_i}\,dx_i. \tag{2-20}$$
But in cartesian coordinates
$$ds^2 = \sum_{k=1}^{3} dy_k\,dy_k,$$
and the substitution from (2-20) in this formula yields¹
$$ds^2 = \sum_{k=1}^{3} \Bigl[\sum_{i=1}^{3} \frac{\partial y_k}{\partial x_i}\,dx_i\Bigr]\Bigl[\sum_{j=1}^{3} \frac{\partial y_k}{\partial x_j}\,dx_j\Bigr]
= \sum_{i=1}^{3}\sum_{j=1}^{3} \Bigl(\sum_{k=1}^{3} \frac{\partial y_k}{\partial x_i}\frac{\partial y_k}{\partial x_j}\Bigr) dx_i\,dx_j. \tag{2-21}$$
On comparing (2-21) with (2-12), we see that
$$g_{ij} = \sum_{k=1}^{3} \frac{\partial y_k}{\partial x_i}\frac{\partial y_k}{\partial x_j}, \qquad i, j = 1, 2, 3. \tag{2-22}$$
This is the desired formula for the calculation of metric coefficients.
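Formula (2-22) lends itself to a direct numerical check. In the Python sketch below the partial derivatives are approximated by central differences instead of being computed analytically; this substitution, the step `h`, and the function names are our assumptions, made only to illustrate the formula. The cylindrical transformation (1-7) serves as the test case.

```python
import math


def metric_coefficients(transform, x, h=1e-6):
    """Approximate g_ij = sum_k (dy_k/dx_i)(dy_k/dx_j) at the point x.

    `transform` maps (x1, x2, x3) to the cartesian triplet (y1, y2, y3);
    the derivatives are estimated by central differences with step h.
    """
    n = len(x)
    derivs = []                       # derivs[i][k] approximates dy_k/dx_i
    for i in range(n):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        y_plus = transform(*forward)
        y_minus = transform(*backward)
        derivs.append([(p - m) / (2 * h) for p, m in zip(y_plus, y_minus)])
    return [[sum(di * dj for di, dj in zip(derivs[i], derivs[j]))
             for j in range(n)]
            for i in range(n)]


def cylindrical(x1, x2, x3):
    """Eqs. (1-7) with x1 = r, x2 = theta, x3 = z."""
    return (x1 * math.cos(x2), x1 * math.sin(x2), x3)


g = metric_coefficients(cylindrical, [2.0, 0.7, 1.5])
```

At the point with $x_1 = 2$ the computed array is, to the accuracy of the differences, diagonal with entries 1, 4, 1; the off-diagonal entries vanish, as they must in an orthogonal system.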
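To illustrate the use of (2-22) consider a coordinate system defined by
Eqs. (1-7), which we write in the form
$$y_1 = x_1\cos x_2, \qquad y_2 = x_1\sin x_2, \qquad y_3 = x_3,$$
to agree with the notation used in this section. From (2-22) we have
$$g_{11} = \Bigl(\frac{\partial y_1}{\partial x_1}\Bigr)^2 + \Bigl(\frac{\partial y_2}{\partial x_1}\Bigr)^2 + \Bigl(\frac{\partial y_3}{\partial x_1}\Bigr)^2
= \cos^2 x_2 + \sin^2 x_2 + 0 = 1,$$
$$g_{22} = \Bigl(\frac{\partial y_1}{\partial x_2}\Bigr)^2 + \Bigl(\frac{\partial y_2}{\partial x_2}\Bigr)^2 + \Bigl(\frac{\partial y_3}{\partial x_2}\Bigr)^2
= x_1^2\sin^2 x_2 + x_1^2\cos^2 x_2 + 0 = x_1^2,$$
$$g_{33} = \Bigl(\frac{\partial y_1}{\partial x_3}\Bigr)^2 + \Bigl(\frac{\partial y_2}{\partial x_3}\Bigr)^2 + \Bigl(\frac{\partial y_3}{\partial x_3}\Bigr)^2
= 0 + 0 + 1 = 1,$$
$$g_{12} = \frac{\partial y_1}{\partial x_1}\frac{\partial y_1}{\partial x_2} + \frac{\partial y_2}{\partial x_1}\frac{\partial y_2}{\partial x_2} + \frac{\partial y_3}{\partial x_1}\frac{\partial y_3}{\partial x_2}
= \cos x_2\,(-x_1\sin x_2) + \sin x_2\,(x_1\cos x_2) + 0 = 0.$$
We find in the same way that $g_{23} = g_{31} = 0$. Hence the system under
consideration is orthogonal. The expression for $ds^2$ is
$$ds^2 = \sum_{i=1}^{3}\sum_{j=1}^{3} g_{ij}\,dx_i\,dx_j = (dx_1)^2 + x_1^2(dx_2)^2 + (dx_3)^2,$$
which is a familiar formula for the square of the arc element in cylindrical
coordinates if we recall that $x_1 = r$, $x_2 = \theta$, $x_3 = z$. Since this system is
orthogonal, the element of volume is given by (2-18), which in our case gives
$$d\tau = r\,dr\,d\theta\,dz.$$
¹ Note that the summation index can be changed at will so that
$$\sum_{i=1}^{3} \frac{\partial y_k}{\partial x_i}\,dx_i = \sum_{j=1}^{3} \frac{\partial y_k}{\partial x_j}\,dx_j.$$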
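Example: Obtain expressions for the elements of arc and volume in the coordinate
system $x$ defined by
$$\begin{aligned}
y_1 &= x_1 + x_2 + x_3, \\
y_2 &= x_1 - x_2 - x_3, \\
y_3 &= 2x_1 + x_2 - x_3,
\end{aligned} \tag{2-23}$$
and discuss the system.
On making use of formula (2-22) we find as in the preceding illustration that
$$g_{11} = 6, \quad g_{22} = 3, \quad g_{33} = 3, \quad g_{12} = 2, \quad g_{23} = 1, \quad g_{13} = -2.$$
Hence
$$ds^2 = 6(dx_1)^2 + 4\,dx_1\,dx_2 - 4\,dx_1\,dx_3 + 2\,dx_2\,dx_3 + 3(dx_2)^2 + 3(dx_3)^2.$$
The system is clearly not orthogonal, and to compute $d\tau$ we shall make use of formula
(2-17). Now
$$\mathbf{r} = \mathbf{i}_1y_1 + \mathbf{i}_2y_2 + \mathbf{i}_3y_3
= \mathbf{i}_1(x_1 + x_2 + x_3) + \mathbf{i}_2(x_1 - x_2 - x_3) + \mathbf{i}_3(2x_1 + x_2 - x_3)$$
and hence the base vectors $\mathbf{a}_i = \partial \mathbf{r}/\partial x_i$ are
$$\begin{aligned}
\mathbf{a}_1 &= \mathbf{i}_1 + \mathbf{i}_2 + 2\mathbf{i}_3, \\
\mathbf{a}_2 &= \mathbf{i}_1 - \mathbf{i}_2 + \mathbf{i}_3, \\
\mathbf{a}_3 &= \mathbf{i}_1 - \mathbf{i}_2 - \mathbf{i}_3.
\end{aligned}$$
Thus,
$$d\tau = |\mathbf{a}_1 \cdot \mathbf{a}_2 \times \mathbf{a}_3|\,dx_1\,dx_2\,dx_3
= \left|\,\begin{vmatrix} 1 & 1 & 2 \\ 1 & -1 & 1 \\ 1 & -1 & -1 \end{vmatrix}\,\right| dx_1\,dx_2\,dx_3
= 4\,dx_1\,dx_2\,dx_3.$$
On solving (2-23) for the $x_i$ we get
$$\begin{aligned}
x_1 &= \tfrac12 y_1 + \tfrac12 y_2, \\
x_2 &= -\tfrac14 y_1 - \tfrac34 y_2 + \tfrac12 y_3, \\
x_3 &= \tfrac34 y_1 + \tfrac14 y_2 - \tfrac12 y_3.
\end{aligned}$$
The coordinate surfaces $x_i = c_i$ are planes, and the coordinate lines $x_i$ are therefore
straight lines.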
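Since the transformation (2-23) is linear, its partial derivatives are constants and the numbers in this example can be reproduced exactly with integer arithmetic. The sketch below is our own check; the array `J` of partial derivatives and the helper `det3` are names introduced for the illustration.

```python
def det3(M):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


# J[k][i] is the constant partial derivative dy_k/dx_i read off from (2-23).
J = [[1, 1, 1],
     [1, -1, -1],
     [2, 1, -1]]

# Metric coefficients from (2-22): g_ij = sum over k of J[k][i] * J[k][j].
g = [[sum(J[k][i] * J[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]

# The columns of J are the base vectors a_1, a_2, a_3, so |det J| is the
# factor multiplying dx1 dx2 dx3 in the element of volume (2-17).
volume_factor = abs(det3(J))
```

The array `g` reproduces the six coefficients listed in the example, and `volume_factor` equals 4. As a further consistency check, the determinant of `g` is 16, the square of the volume factor.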
The system in this example is a special case of an affine coordinate system
determined by the transformation
$$y_i = a_{i1}x_1 + a_{i2}x_2 + a_{i3}x_3, \qquad i = 1, 2, 3, \tag{2-24}$$
in which the $a_{ij}$ are constants. Affine transformations (2-24) occur in the
study of elastic deformations, in dynamics of rigid bodies, and in many
other branches of mathematical physics.
PROBLEMS
1. Discuss in the manner of the preceding example a coordinate system x determined by
1 1
w " Z^ Xl " V5**-
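2. Compute the metric coefficients appropriate to a spherical coordinate system defined
by Eqs. (1-9), and thus show that
$$(ds)^2 = (d\rho)^2 + \rho^2(d\theta)^2 + \rho^2\sin^2\theta\,(d\phi)^2$$
and $d\tau = \rho^2\sin\theta\,d\rho\,d\theta\,d\phi$.
3. If $\mathbf{R} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$ is the position vector of a moving point $P(x, y, z)$ in cartesian
coordinates, show that the unit base vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ in cylindrical coordinates $(r, \theta, z)$
[see (1-7)] are
$$\mathbf{e}_r = \mathbf{i}\cos\theta + \mathbf{j}\sin\theta, \qquad \mathbf{e}_\theta = -\mathbf{i}\sin\theta + \mathbf{j}\cos\theta, \qquad \mathbf{e}_z = \mathbf{k}.$$
Show that $\mathbf{R} = r\mathbf{e}_r + z\mathbf{e}_z$, compute $d\mathbf{R}/dt$ and $d^2\mathbf{R}/dt^2$, and thus show that the velocity
$\mathbf{v}$ and the acceleration $\mathbf{a}$ of the point $P$ are
$$\mathbf{v} = \frac{dr}{dt}\,\mathbf{e}_r + r\frac{d\theta}{dt}\,\mathbf{e}_\theta + \frac{dz}{dt}\,\mathbf{e}_z,$$
$$\mathbf{a} = \Bigl[\frac{d^2r}{dt^2} - r\Bigl(\frac{d\theta}{dt}\Bigr)^2\Bigr]\mathbf{e}_r
+ \Bigl[r\frac{d^2\theta}{dt^2} + 2\frac{dr}{dt}\frac{d\theta}{dt}\Bigr]\mathbf{e}_\theta + \frac{d^2z}{dt^2}\,\mathbf{e}_z.$$
4. If $\mathbf{R} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$ is the position vector of $P(x, y, z)$ in cartesian coordinates,
show that the unit base vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in spherical coordinates defined by Eqs. (1-9) are
$$\begin{aligned}
\mathbf{e}_\rho &= \mathbf{i}\sin\theta\cos\phi + \mathbf{j}\sin\theta\sin\phi + \mathbf{k}\cos\theta, \\
\mathbf{e}_\theta &= \mathbf{i}\cos\theta\cos\phi + \mathbf{j}\cos\theta\sin\phi - \mathbf{k}\sin\theta, \\
\mathbf{e}_\phi &= -\mathbf{i}\sin\phi + \mathbf{j}\cos\phi.
\end{aligned}$$
5. If the position vector $\mathbf{R}$ of a moving point $P$ in spherical coordinates is written as
$\mathbf{R} = \rho\mathbf{e}_\rho$, where $\mathbf{e}_\rho$ is the unit vector in the direction of the increasing coordinate $\rho$, use
the results of Prob. 4 to show that
$$\frac{d\mathbf{R}}{dt} = \mathbf{e}_\rho\frac{d\rho}{dt} + \mathbf{e}_\theta\,\rho\frac{d\theta}{dt} + \mathbf{e}_\phi\,\rho\sin\theta\,\frac{d\phi}{dt}.$$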
3. Scalar and Vector Fields. Gradient. If in some region of space a
scalar u(P) is defined at every point, we say that u(P) is a scalar point
function. An example of such a function is the temperature at any point
in a solid. A function v(P) defining a vector at every point P of the
given region is a vector point function. An example of vector point function
is the velocity at any point P of a fluid. The regions of definition of
scalar and vector point functions are sometimes called fields, and one thus
speaks of scalar and vector fields. Unless otherwise noted, we shall assume
that u(P) and v(P) are single-valued functions.
To facilitate calculations involving scalar and vector point functions, it
is often convenient to refer the region of their definition to a special co-
ordinate system x. If this is done, the coordinates of P can be denoted by
$(x_1, x_2, x_3)$ and $u(P)$ and $\mathbf{v}(P)$ can be denoted by $u(x_1, x_2, x_3)$ and $\mathbf{v}(x_1, x_2, x_3)$,
respectively. As explained in the preceding section, $\mathbf{v}(x_1, x_2, x_3)$ can then
be represented in terms of its components $v_i(x_1, x_2, x_3)$ $(i = 1, 2, 3)$ along
the appropriate base vectors $\mathbf{a}_i$. It should be noted, however,
that the introduction of coordinate systems is a matter of convenience
and that u(P ) and v(P) depend only on the choice of P in the field and
not on any special reference frame selected to locate P. The fact that
scalar and vector point functions are independent of coordinate systems
is spoken of as invariance , and we shall see that it is possible to associate
with u(P) and v(jP) certain new scalar and vector functions which have
important invariant significance.
We say that u(P) and v(P) are continuous at P if
$$\lim_{P'\to P} u(P') = u(P) \quad\text{and}\quad \lim_{P'\to P} \mathbf{v}(P') = \mathbf{v}(P)$$
for every choice of P' in the neighborhood of P. Functions continuous
at every point of the region are said to be continuous in the region.
[Fig. 7 and Fig. 8]
Let u(P) be a continuous scalar function in the given region. We
select a point O in this region for the origin of position vectors $\mathbf{r} = \overrightarrow{OP}$.
If P' is some point in the neighborhood of P, we denote $\overrightarrow{OP'}$ by $\mathbf{r}'$ and write
$$\mathbf{r}' = \mathbf{r} + \Delta\mathbf{r}.$$
The difference quotient
$$\frac{u(P') - u(P)}{|\Delta\mathbf{r}|} = \frac{u(P') - u(P)}{\Delta s}, \tag{3-1}$$
where $|\Delta\mathbf{r}| = \Delta s$, gives an approximate space rate of change of u(P), and
we can study the limit of (3-1) as P' is made to approach P along the
rectilinear path $\Delta\mathbf{r}$. If this limit exists, we shall write
$$\lim_{\Delta s\to 0} \frac{u(P') - u(P)}{\Delta s} = \frac{du}{ds} \tag{3-2}$$
and call it the directional derivative of u(P) in the direction specified by $\Delta\mathbf{r}$.
A different choice of P' yields a different vector $\Delta\mathbf{r}$ and in general a different
value for du/ds at P.
A set of points for which u(P) has a constant value c determines a
surface S called a level surface; we assume that at each point of S there is a
uniquely determined tangent plane. Let us consider a pair of such
surfaces S and S' determined by u = c and u = c + Δc, where Δc is a small
change in c (Fig. 8). If P is a point on S and P' on S', the change
Δu = u(P') − u(P) is Δc, and this is independent of the position of P' on S'.
But the average space rate of change
$$\frac{u(P') - u(P)}{|\Delta\mathbf{r}|} = \frac{\Delta u}{\Delta s} \tag{3-3}$$
clearly depends on the magnitude of $\Delta\mathbf{r}$. The limit of this ratio as $\Delta\mathbf{r}$ is
made to approach zero by making Δc → 0 is the directional derivative
(3-2) in the fixed direction determined by $\Delta\mathbf{r}$. The greatest space rate
of change of u will occur when P' is taken on the normal $\overrightarrow{PQ} = \Delta\mathbf{n}$ to the
surface S (Fig. 8), since for this position of P' the denominator $|\Delta\mathbf{r}|$ in
(3-3) takes its least value, $|\Delta\mathbf{n}|$. Indeed,
$$\Delta n \approx |\Delta\mathbf{r}|\cos\theta, \tag{3-4}$$
where θ is the angle between the normal PQ to S and PP'.
On taking account of (3-4), we conclude that
$$\frac{du}{dn} = \frac{1}{\cos\theta}\frac{du}{ds} = \frac{du}{ds}\sec\theta. \tag{3-5}$$
The derivative du/dn in the direction of the normal to the level surface
u = const is called the normal derivative of u(P).
If n is a unit vector at P, pointing in the direction for which Δu > 0,
we can construct a vector, called the gradient of u, namely,
$$\operatorname{grad} u \equiv \mathbf{n}\,\frac{du}{dn}. \tag{3-6}$$
This vector represents in both the direction and magnitude the greatest
space rate of increase of u(P), provided, of course, that du/dn ≠ 0. The
gradient vector (3-6) is clearly independent of the choice of coordinate
systems and hence is an invariant. If we introduce the familiar cartesian
coordinates xyz and denote u(P) by u(x,y,z), then, as in Chap. 3, Sec. 8,
$$\frac{du}{ds} = \frac{\partial u}{\partial x}\frac{dx}{ds} + \frac{\partial u}{\partial y}\frac{dy}{ds} + \frac{\partial u}{\partial z}\frac{dz}{ds}, \tag{3-7}$$
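where dx/ds = cos (x,s), dy/ds = cos (y,s), dz/ds = cos (z,s) are the
direction cosines of the unit vector s in the direction of the arc element ds
(Fig. 9). In this case the position vector r of P is
$$\mathbf{r} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$$
and
$$\frac{d\mathbf{r}}{ds} \equiv \mathbf{s} = \mathbf{i}\frac{dx}{ds} + \mathbf{j}\frac{dy}{ds} + \mathbf{k}\frac{dz}{ds}. \tag{3-8}$$
We see that (3-7) can be written as the scalar product of the vector
$$\nabla u \equiv \mathbf{i}\frac{\partial u}{\partial x} + \mathbf{j}\frac{\partial u}{\partial y} + \mathbf{k}\frac{\partial u}{\partial z} \tag{3-9}$$
and the unit vector s in (3-8). Thus,
$$\frac{du}{ds} = \nabla u\cdot\mathbf{s}. \tag{3-10}$$
Inasmuch as the greatest value of du/ds is assumed when the direction
of s coincides with that of the normal n to the level surface u = const, we
conclude that
$$\nabla u = \operatorname{grad} u, \tag{3-11}$$
for the right-hand member of (3-10) can be interpreted as the component
of the vector ∇u in the direction s and the maximum component du/ds
is obtained when s is directed along ∇u.
It follows from (3-9) and (3-11) that a formula for calculating grad u in
cartesian coordinates is
$$\operatorname{grad} u = \mathbf{i}\frac{\partial u}{\partial x} + \mathbf{j}\frac{\partial u}{\partial y} + \mathbf{k}\frac{\partial u}{\partial z}. \tag{3-12}$$
On comparing (3-6) and (3-12), we see that
$$|\operatorname{grad} u| = \frac{du}{dn} = \sqrt{\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial u}{\partial z}\right)^2}.$$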
Formula (3-9) suggests a definition of the differential vector operator
∇, called del or nabla,
$$\nabla \equiv \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}, \tag{3-13}$$
analogous to the scalar differential operator D introduced in Chap. 1,
Sec. 23. The product of ∇ and the scalar u(x,y,z) is interpreted to mean
(3-9). The reader will show that
$$\nabla(u + v) = \nabla u + \nabla v, \qquad \nabla(uv) = u\nabla v + v\nabla u, \tag{3-14}$$
whenever u and v are scalar functions of (x,y,z). A formula for grad u
in orthogonal curvilinear coordinates is deduced in Sec. 13.
The directional derivative dv/ds of a vector point function v(P) is
defined by formula (3-2) in which u(P) is replaced by v(P). When v(P)
is expressed in the form
$$\mathbf{v} = \mathbf{i}v_x + \mathbf{j}v_y + \mathbf{k}v_z, \tag{3-15}$$
where i, j, k are the base vectors in the system x,y,z,
$$\frac{d\mathbf{v}}{ds} = \mathbf{i}\frac{dv_x}{ds} + \mathbf{j}\frac{dv_y}{ds} + \mathbf{k}\frac{dv_z}{ds}. \tag{3-16}$$
We have already employed a similar formula in Chap. 4, Sec. 7, to
calculate the derivatives of the position vector R = ix + jy + kz with respect
to the time parameter t.
Example 1. Find the directional derivative of $u = xyz^2$ at (1,0,3) in the direction of
the vector i − j + k. Compute the greatest rate of change of u and the direction of
the maximum rate of increase of u.
On substituting $u = xyz^2$ in (3-9), we find that the gradient of u is given by
$$\nabla u = \mathbf{i}yz^2 + \mathbf{j}xz^2 + \mathbf{k}\,2xyz.$$
At (1,0,3)
$$\nabla u = \mathbf{i}0 + \mathbf{j}9 + \mathbf{k}0 = 9\mathbf{j}.$$
Thus, the greatest rate of change $|\nabla u| = 9$, and the direction of the maximum rate of
change is along the y axis. Since the unit vector s in the direction of the vector i − j + k
is
$$\mathbf{s} = \frac{1}{\sqrt{3}}(\mathbf{i} - \mathbf{j} + \mathbf{k}),$$
we find on using (3-10) that the desired directional derivative is
$$\frac{du}{ds} = \nabla u\cdot\mathbf{s} = 9\mathbf{j}\cdot\frac{1}{\sqrt{3}}(\mathbf{i} - \mathbf{j} + \mathbf{k}) = -\frac{9}{\sqrt{3}}.$$
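The arithmetic of Example 1 can be checked numerically. The following is a minimal sketch, assuming central differences with an arbitrarily chosen step h; the helper names `grad` and `directional_derivative` are illustrative and do not come from the text.

```python
import math

def grad(f, p, h=1e-5):
    """Central-difference estimate of (du/dx, du/dy, du/dz) at the point p."""
    g = []
    for i in range(len(p)):
        fwd, bwd = list(p), list(p)
        fwd[i] += h
        bwd[i] -= h
        g.append((f(*fwd) - f(*bwd)) / (2 * h))
    return g

def directional_derivative(f, p, direction):
    """du/ds = grad u . s, as in (3-10), with s the unit vector along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    s = [c / norm for c in direction]
    return sum(gi * si for gi, si in zip(grad(f, p), s))

def u(x, y, z):
    return x * y * z**2

P = (1.0, 0.0, 3.0)
grad_u = grad(u, P)                                      # close to (0, 9, 0)
max_rate = math.sqrt(sum(c * c for c in grad_u))         # close to 9
du_ds = directional_derivative(u, P, (1.0, -1.0, 1.0))   # close to -9/sqrt(3)
print(grad_u, max_rate, du_ds)
```

The sign of `du_ds` is negative because the chosen direction has a negative component along the gradient 9j.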
Example 2. Find the unit normals to the surface $x^2 - y^2 + z^2 = 6$ at (1,2,3).
The surface in this example is a level surface for the function $u = x^2 - y^2 + z^2$.
Since the gradient of u is normal to the level surface u = const, we have by (3-11)
$$\operatorname{grad} u = \nabla u = \mathbf{i}2x - \mathbf{j}2y + \mathbf{k}2z,$$
which at (1,2,3) has the value
$$\nabla u = \mathbf{i}2 - \mathbf{j}4 + \mathbf{k}6.$$
But this vector is directed along the unit normal n to $u = x^2 - y^2 + z^2 = 6$ in the
direction of increasing u. Hence
$$\mathbf{n} = \frac{\nabla u}{|\nabla u|} = \frac{1}{\sqrt{56}}(\mathbf{i}2 - \mathbf{j}4 + \mathbf{k}6).$$
The direction of the other unit normal vector is opposite to this.
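As a small illustration of Example 2, the sketch below normalizes the gradient at (1,2,3) and returns both unit normals. The function name `unit_normals` is hypothetical, and the gradient components are entered by hand from the formula ∇u = i2x − j2y + k2z.

```python
import math

def unit_normals(grad_at_p):
    """Return the two unit normals +n and -n built from a nonzero gradient vector."""
    length = math.sqrt(sum(c * c for c in grad_at_p))
    n = [c / length for c in grad_at_p]
    return n, [-c for c in n]

x, y, z = 1.0, 2.0, 3.0
grad_u = [2 * x, -2 * y, 2 * z]          # gradient of x^2 - y^2 + z^2 at (1,2,3)
n_plus, n_minus = unit_normals(grad_u)   # n_plus points toward increasing u
print(n_plus, n_minus)
```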
PROBLEMS
1. Compute the directional derivative of $u = x^2 + y^2 + z^2$ at (1,2,3) in the direction
of the line
$$\frac{x}{3} = \frac{y}{4} = \frac{z}{5}.$$
Find the maximum rate of increase of u at (1,2,3); at (0,1,2).
2. Find grad u if (a) $u = (x^2 + y^2 + z^2)^{-1/2}$, (b) $u = \log(x^2 + y^2 + z^2)$.
3. Find the directional derivative of $u = x^2y - y^2z - xyz$ at (1,−1,0) in the
direction of the vector i − j + 2k.
4. Find the directional derivative of u = xyz at (1,2,3) in the direction from (1,2,3)
to (1,−1,−3).
5. Find the unit normal vector in the direction of the exterior normal to the surface
$x^2 + 2y^2 + z^2 = 7$ at (1,−1,2).
6. Find the unit vectors normal to xyz = 2 at (1,−1,−2).
7. Show that $\nabla r^n = n r^{n-2}\mathbf{r}$, where $\mathbf{r} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$ and $r = |\mathbf{r}|$.
8. Use the result of Prob. 7 to compute the directional derivative of u = (x 2 4- y 2 + z 2 ) s
at ( — 1,1,2) in the direction of the vector i — 2j k.
9. Compute the directional derivative of
v « i(x 2 - y 2 ) + j (xyz - 1) + k*
at (1,2,0) in the direction from (1,2,0) to (0,0,0).
4. Integration of Vector Functions. Integrals of vector functions with
the integrands consisting of scalar products of vectors are defined in the
usual manner. Thus if v(P) is a continuous vector point function specified
along a curve C joining a pair of points $P_0$, $P'$, and if r(P) is the position
vector of P on C, then the integral
$$\int_C \mathbf{v}\cdot d\mathbf{r} \equiv \int_{P_0}^{P'} \mathbf{v}\cdot d\mathbf{r} \tag{4-1}$$
is defined as the limit of a sum constructed as follows. Let C, which we
suppose to be sectionally smooth,¹ be divided into n arc elements $\Delta s_i$
by inserting the points $P_i$ (Fig. 10). We form the sum
$$\sum_{i=1}^{n} \mathbf{v}(P_i)\cdot\Delta\mathbf{r}_i, \tag{4-2}$$
where $\Delta\mathbf{r}_i = \mathbf{r}_{i+1} - \mathbf{r}_i$, and compute the limit of this sum as n → ∞
and every $|\Delta\mathbf{r}_i| \to 0$. The continuity of v(P) and the smoothness of C
suffice to show that the limit of (4-2) exists, and we define the line integral
(4-1) to be this limit.
¹ This means that C consists of a finite number of segments with continuously
changing tangents. The term piecewise smooth is also used.
If v(P) is defined in some region containing several paths joining $P_0$
and $P'$, then the integral (4-1) will ordinarily have different values when
computed along different paths. In exceptional circumstances, discussed
in the following sections, these values may turn out to be equal.
If we introduce the xyz coordinate system and write
$$\mathbf{v}(P) = \mathbf{v}(x,y,z) \equiv \mathbf{i}v_x(x,y,z) + \mathbf{j}v_y(x,y,z) + \mathbf{k}v_z(x,y,z),$$
$$d\mathbf{r} \equiv \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz,$$
the integral (4-1) becomes
$$\int_{P_0}^{P'} [v_x(x,y,z)\,dx + v_y(x,y,z)\,dy + v_z(x,y,z)\,dz]. \tag{4-3}$$
When the equations of C are given in parametric form
$$x = x(t), \quad y = y(t), \quad z = z(t), \qquad t_0 \le t \le t', \tag{4-4}$$
where the values $t_0$, $t'$ of the parameter t correspond to the end points $P_0$, $P'$
of C, the integral (4-3) can be expressed as a definite integral¹ $\int_{t_0}^{t'} F(t)\,dt$ and
evaluated by the usual means.
Similarly, the surface integral
$$\int_\Sigma \mathbf{v}\cdot\mathbf{n}\,d\sigma, \tag{4-5}$$
where n is a unit normal specified at all points of a sufficiently smooth
surface² Σ, can be defined as the limit of the sum
¹ See the examples at the end of this section. The equivalence of the integral (4-3)
and the ordinary Riemann integral, when (4-4) holds, is easily seen by comparing the
sums of which these integrals are the respective limits.
² We assume that the surface Σ is two-sided and that n is directed toward one side.
This normal we elect to call positive. If the surface is closed, it is customary to regard
the exterior normal as positive.
$$\lim_{k\to\infty} \sum_{i=1}^{k} \mathbf{v}(P_i)\cdot\mathbf{n}(P_i)\,\Delta\sigma_i. \tag{4-6}$$
In constructing this sum it is supposed that the surface Σ is divided into
k elements of areas $\Delta\sigma_i$ and $P_i$ is chosen somewhere in the element $\Delta\sigma_i$.
The limit is then computed by increasing the number k of elements in
such a way that the maximum diameter of every $\Delta\sigma_i$ approaches zero.
Formally one is tempted to extend these "limits of the sum" definitions to such
symbols as
$$\int_C \mathbf{v}(P)\,ds, \qquad \int_\Sigma \mathbf{v}(P)\,d\sigma, \qquad \int_\tau \mathbf{v}(P)\,d\tau, \tag{4-7}$$
in which v(P) is a vector function and ds, dσ, and dτ, respectively, are the elements of
arc length, surface, and volume. Thus, there is a temptation to define the volume
integral $\int_\tau \mathbf{v}(P)\,d\tau$ by the formula
$$\int_\tau \mathbf{v}(P)\,d\tau = \lim_{k\to\infty} \sum_{i=1}^{k} \mathbf{v}(P_i)\,\Delta\tau_i, \tag{4-8}$$
in which it is imagined that the volume τ is divided into elements of volume $\Delta\tau_i$. A
definition such as (4-8) requires forming sums $\sum_{i=1}^{k} \mathbf{v}(P_i)\,\Delta\tau_i$ of the bound vectors $\mathbf{v}(P_i)$
which are determined at different points of the body. There is a question whether the rules
for addition of free vectors given in Chap. 4, Sec. 2, can be used to provide a sensible
definition of (4-8). Without going into details we state it as a fact that the definition
(4-8) makes sense in those geometries where the distance between a pair of points is
given by the Pythagorean formula.¹
If v(P) is expressed in terms of its cartesian components as $\mathbf{v} = \mathbf{i}v_x(x,y,z) + \mathbf{j}v_y(x,y,z) + \mathbf{k}v_z(x,y,z)$,
the integrals in (4-7) can be reduced to the evaluation of three ordinary
integrals by writing, for example,
$$\int_\tau \mathbf{v}(P)\,d\tau = \mathbf{i}\int_\tau v_x\,d\tau + \mathbf{j}\int_\tau v_y\,d\tau + \mathbf{k}\int_\tau v_z\,d\tau.$$
No such simple means of evaluating integrals of the type (4-7) are available in curvilinear
coordinates because the base vectors in curvilinear coordinate systems vary from point
to point in space. This remark may serve to explain why cartesian coordinates are so
prominent in calculations involving vectors.
Line integrals of the form
$$\int_C [P(x,y,z)\,dx + Q(x,y,z)\,dy + R(x,y,z)\,dz], \tag{4-9}$$
which are identical with (4-3), are frequently defined without reference to vectors, but as
we shall see, the definition adopted here has many interesting and immediate physical
interpretations.
¹ Spaces so metrized are called Euclidean, and it is only with such that we are
concerned in this book.
Example 1. Evaluate the integral $\int_C \mathbf{r}\cdot d\mathbf{r}$ when C is the helical path
$$x = \cos t, \quad y = \sin t, \quad z = t \tag{4-10}$$
joining the points determined by t = 0 and t = π/2 and also when C is the straight line
joining these points.
Since r = ix + jy + kz, we get, on using (4-10),
$$\mathbf{r} = \mathbf{i}\cos t + \mathbf{j}\sin t + \mathbf{k}t,$$
$$d\mathbf{r} = (-\mathbf{i}\sin t + \mathbf{j}\cos t + \mathbf{k})\,dt.$$
Hence
$$\int_C \mathbf{r}\cdot d\mathbf{r} = \int_0^{\pi/2} t\,dt = \frac{\pi^2}{8}. \tag{4-11}$$
If the path C is a straight line joining the same points (1,0,0) and (0,1,π/2), we can
write its equation in vector form as
$$\mathbf{r} = \mathbf{r}_1 + (\mathbf{r}_2 - \mathbf{r}_1)t, \tag{4-12}$$
where $\mathbf{r}_1$ and $\mathbf{r}_2$ are the position vectors of (1,0,0) and (0,1,π/2), respectively (Fig. 11).
The parameter t clearly varies between 0 and 1, since for t = 0, (4-12) yields $\mathbf{r} = \mathbf{r}_1$
and for t = 1, $\mathbf{r} = \mathbf{r}_2$. But $\mathbf{r}_1 = \mathbf{i}$, $\mathbf{r}_2 = \mathbf{j} + (\pi/2)\mathbf{k}$, so that (4-12) reduces to
$$\mathbf{r} = \mathbf{i} + \left(\mathbf{j} + \frac{\pi}{2}\mathbf{k} - \mathbf{i}\right)t.$$
Hence
$$\int_C \mathbf{r}\cdot d\mathbf{r} = \int_0^1 \left[-1 + \left(2 + \frac{\pi^2}{4}\right)t\right]dt = \frac{\pi^2}{8}.$$
This is the same value as we got for the helical path. In the following section we shall
see why this particular integral is independent of the path.
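The sum (4-2) can also be formed directly on a computer. The sketch below is one possible arrangement, assuming that v is sampled at the midpoint of each chord (a choice made here for accuracy; the definition allows any point of the arc element) and that the number of subdivisions n is chosen arbitrarily. The name `line_integral` is illustrative.

```python
import math

def line_integral(v, r, t0, t1, n=20000):
    """Approximate the sum (4-2) of v(P_i) . (delta r_i) along r(t), t0 <= t <= t1."""
    total = 0.0
    prev = r(t0)
    for i in range(1, n + 1):
        cur = r(t0 + (t1 - t0) * i / n)
        mid = [(a + b) / 2 for a, b in zip(prev, cur)]   # sample point: chord midpoint
        dr = [b - a for a, b in zip(prev, cur)]          # chord vector delta r_i
        total += sum(vi * di for vi, di in zip(v(*mid), dr))
        prev = cur
    return total

def v(x, y, z):
    return (x, y, z)          # the field v = r of Example 1

helix = line_integral(v, lambda t: (math.cos(t), math.sin(t), t), 0.0, math.pi / 2)
straight = line_integral(v, lambda t: (1 - t, t, math.pi / 2 * t), 0.0, 1.0)
print(helix, straight, math.pi**2 / 8)
```

Both printed values agree with π²/8 to rounding error, in line with the path independence noted in the example.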
Example 2. Compute the value of $\int_C \mathbf{v}\cdot d\mathbf{r}$, where v = iy + j2x and C is the straight
line joining (0,1) and (1,0). Discuss also when C is the arc of a circle centered at the
origin.
Since r = ix + jy, we have dr = i dx + j dy and therefore
$$\int_C \mathbf{v}\cdot d\mathbf{r} = \int_C (y\,dx + 2x\,dy). \tag{4-13}$$
To evaluate this integral along the rectilinear path in Fig. 12, we write the equation of
the path in the form
$$y = -x + 1 \tag{4-14}$$
and insert (4-14) in (4-13). Since dy = −dx, we get
$$\int_C \mathbf{v}\cdot d\mathbf{r} = \int_0^1 [(-x + 1)\,dx - 2x\,dx] = \int_0^1 (1 - 3x)\,dx = -\tfrac{1}{2}.$$
The integration here is performed so that the path C is traced from the point (0,1) to
(1,0).
To compute the value of the integral (4-13) over the circular path C' joining the
same two points, we note that $y = \sqrt{1 - x^2}$ along C', $dy = -x\,dx/\sqrt{1 - x^2}$, so that
$$\int_{C'} \mathbf{v}\cdot d\mathbf{r} = \int_0^1 \left(\sqrt{1 - x^2} - \frac{2x^2}{\sqrt{1 - x^2}}\right)dx = \int_0^1 \frac{1 - 3x^2}{\sqrt{1 - x^2}}\,dx = -\frac{\pi}{4}.$$
Again, the path C' is traced out from (0,1) to (1,0). If the direction of description of
C' is reversed, so that the circle is traced out from (1,0) to (0,1), the limits in the
integral must be interchanged and we get +π/4 for the value of the integral.
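Example 2 also illustrates the reduction of (4-3) to an ordinary integral ∫F(t) dt once the path is given parametrically as in (4-4). The sketch below assumes the parametrizations x = t, y = 1 − t for the straight line and x = sin t, y = cos t for the quarter circle (both chosen here, not taken from the text), and uses a composite Simpson rule; the name `simpson` is illustrative.

```python
import math

def simpson(F, a, b, n=2000):
    """Composite Simpson rule for the ordinary integral of F(t) over [a, b]; n is even."""
    h = (b - a) / n
    s = F(a) + F(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * F(a + i * h)
    return s * h / 3

# v . dr = y dx + 2x dy = [y(t) x'(t) + 2 x(t) y'(t)] dt

# Straight line from (0,1) to (1,0): x = t, y = 1 - t, 0 <= t <= 1.
def F_line(t):
    return (1 - t) * 1.0 + 2 * t * (-1.0)

# Quarter circle from (0,1) to (1,0): x = sin t, y = cos t, 0 <= t <= pi/2.
def F_arc(t):
    return math.cos(t) * math.cos(t) + 2 * math.sin(t) * (-math.sin(t))

I_line = simpson(F_line, 0.0, 1.0)           # close to -1/2
I_arc = simpson(F_arc, 0.0, math.pi / 2)     # close to -pi/4
print(I_line, I_arc)
```

The two values differ, which is the path dependence the example is meant to exhibit.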
Example 3. Evaluate $\int_C \mathbf{v}\cdot d\mathbf{r}$, where $\mathbf{v} = (\mathbf{i}y - \mathbf{j}x)/(x^2 + y^2)$ and C is the circular
path $x^2 + y^2 = 1$ described counterclockwise.
This integral can be evaluated as in the preceding example by substituting in the
integrand $y = \sqrt{1 - x^2}$ for points on the upper half of the circle C and $y = -\sqrt{1 - x^2}$
on the lower half. It is simpler, however, to write the equations of the path in
parametric form
$$x = \cos\theta, \quad y = \sin\theta, \qquad 0 \le \theta \le 2\pi. \tag{4-15}$$
We thus get
$$\mathbf{r} = \mathbf{i}x + \mathbf{j}y = \mathbf{i}\cos\theta + \mathbf{j}\sin\theta,$$
$$d\mathbf{r} = (-\mathbf{i}\sin\theta + \mathbf{j}\cos\theta)\,d\theta,$$
$$\mathbf{v} = \frac{\mathbf{i}\sin\theta - \mathbf{j}\cos\theta}{\sin^2\theta + \cos^2\theta} = \mathbf{i}\sin\theta - \mathbf{j}\cos\theta.$$
Hence
$$\int_C \mathbf{v}\cdot d\mathbf{r} = \int_0^{2\pi} (-\sin^2\theta - \cos^2\theta)\,d\theta = -2\pi.$$
If the path is traced in the clockwise direction, we get +2π.
It may prove instructive to evaluate this integral over the square C' formed by the
lines x = ±1, y = ±1 (Fig. 13).
The integral over C' is equal to the sum of four integrals evaluated over the paths
PQ, QR, RS, SP.
Now along PQ, y = −1, dy = 0, r = ix − j, dr = i dx, and $\mathbf{v} = (-\mathbf{i} - \mathbf{j}x)/(x^2 + 1)$.
Hence
$$\int_{PQ} \mathbf{v}\cdot d\mathbf{r} = -\int_{-1}^{1} \frac{dx}{x^2 + 1} = -\tan^{-1}x\,\Big|_{-1}^{1} = -\frac{\pi}{2}.$$
Along the path QR, x = 1, r = i + jy, dr = j dy, $\mathbf{v} = (\mathbf{i}y - \mathbf{j})/(1 + y^2)$, so that
$$\int_{QR} \mathbf{v}\cdot d\mathbf{r} = -\int_{-1}^{1} \frac{dy}{1 + y^2} = -\frac{\pi}{2}.$$
In a similar way we find that
$$\int_{RS} \mathbf{v}\cdot d\mathbf{r} = \int_{SP} \mathbf{v}\cdot d\mathbf{r} = -\frac{\pi}{2},$$
so that the integral
$$\int_{C'} \mathbf{v}\cdot d\mathbf{r} = -2\pi.$$
This time we obtained the same result as we did for the circular path. In Sec. 15 we shall
see that this is not an accident and that the value of this integral for every closed path
enclosing the origin is −2π.
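The agreement between the circle and the square in Example 3 can be checked with the defining sum (4-2). This is a rough numerical sketch, assuming chord-midpoint sampling and arbitrarily chosen subdivision counts; the corner labels P, Q, R, S are assumed to be (−1,−1), (1,−1), (1,1), (−1,1), which matches the description of the sides PQ and QR in the example.

```python
import math

def line_integral(v, r, t0, t1, n=20000):
    """Approximate the sum (4-2) of v(P_i) . (delta r_i) along r(t), t0 <= t <= t1."""
    total = 0.0
    prev = r(t0)
    for i in range(1, n + 1):
        cur = r(t0 + (t1 - t0) * i / n)
        mid = [(a + b) / 2 for a, b in zip(prev, cur)]   # sample point: chord midpoint
        dr = [b - a for a, b in zip(prev, cur)]
        total += sum(vi * di for vi, di in zip(v(*mid), dr))
        prev = cur
    return total

def v(x, y):
    d = x * x + y * y
    return (y / d, -x / d)    # v = (i y - j x)/(x^2 + y^2)

# Unit circle traced counterclockwise.
circle = line_integral(v, lambda t: (math.cos(t), math.sin(t)), 0.0, 2 * math.pi)

# Square with sides x = +-1, y = +-1 traced counterclockwise (assumed labels P, Q, R, S).
corners = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
square = 0.0
for k in range(4):
    a, b = corners[k], corners[(k + 1) % 4]
    side = lambda t, a=a, b=b: (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)
    square += line_integral(v, side, 0.0, 1.0, n=5000)

print(circle, square, -2 * math.pi)
```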
PROBLEMS
1. Evaluate the integral in Example 2 over the path C consisting of straight-line
segments joining the points (0,1), (0,0), (1,0) in that order.
2. Evaluate the integral in Example 1 over the polygonal path joining the points
(1,0,0), (1,1,0), (1,1,π/2) in that order.
3. Compute the value of the integral $\int_C (xy\,dx - y\,dy + dz)$ over the following paths:
(a) Straight line joining (0,0,0) and (1,1,1),
(b) Straight line joining (0,0,1) and (0,1,1),
(c) Straight line joining (0,0,0) and (1,2,3).
Note that this integral has the form $\int_C \mathbf{v}\cdot d\mathbf{r}$.
4. Compute the integral $\int_C \mathbf{v}\cdot d\mathbf{r}$ where v = ix − jy + kz over the helical path in
Example 1. Also evaluate it over the rectilinear path.
5. Compute the work W done in displacing a particle of unit mass in a constant
gravitational field F = −kg along the following paths:
(a) Straight line joining (0,0,0) and (1,1,1),
(b) A polygonal path joining (0,0,0), (1,1,0), (1,1,1) in that order. Hint: $W = \int_C \mathbf{F}\cdot d\mathbf{r}$.
5. Line Integrals Independent of the Path. A special case of line integral
$$\int_C \mathbf{v}\cdot d\mathbf{r} = \int_C [v_x(x,y,z)\,dx + v_y(x,y,z)\,dy + v_z(x,y,z)\,dz], \tag{5-1}$$
in which v(x,y,z) is known to be the gradient of some single-valued scalar
u(x,y,z) specified in the region R containing C, frequently appears in
applications. Now, if v = ∇u, then
$$\mathbf{v}\cdot d\mathbf{r} = \nabla u\cdot d\mathbf{r} = \frac{\partial u}{\partial x}\,dx + \frac{\partial u}{\partial y}\,dy + \frac{\partial u}{\partial z}\,dz = du, \tag{5-2}$$
and thus the integrand in (5-1) is an exact differential. We can, therefore,
write
$$\int_C \mathbf{v}\cdot d\mathbf{r} = \int_{P_0}^{P} du = u(P) - u(P_0), \tag{5-3}$$
where $P_0$ and P are the end points of the path C.
This result is unique since u, by hypothesis, is single-valued. Moreover,
since it depends only on the end points $P_0$ and P, we see that the
value of the integral in (5-3) is independent of the path joining these points.
If $C_1$ and $C_2$ are two different paths shown in Fig. 14, then
$$\int_{P_0\,(C_1)}^{P} \nabla u\cdot d\mathbf{r} = \int_{P_0\,(C_2)}^{P} \nabla u\cdot d\mathbf{r}. \tag{5-4}$$
But along $C_2$,
$$\int_{P_0}^{P} \nabla u\cdot d\mathbf{r} = -\int_{P}^{P_0} \nabla u\cdot d\mathbf{r},$$
and we can therefore write (5-4) as
$$\int_{P_0\,(C_1)}^{P} \nabla u\cdot d\mathbf{r} + \int_{P\,(C_2)}^{P_0} \nabla u\cdot d\mathbf{r} = 0$$
or
$$\oint_C \nabla u\cdot d\mathbf{r} = 0, \tag{5-5}$$
where C is the closed path formed by $C_1$ and $C_2$.
The results embodied in (5-3) and (5-5) can be stated as a theorem.
Theorem I. The line integral $\int_C \nabla u\cdot d\mathbf{r}$, in which u is a single-valued
continuously differentiable function in a given region R, is independent of the
path, and hence it vanishes for every closed path drawn in R.
At first glance Theorem I appears to contradict the result in Example 3
of Sec. 4, where the integral $\int_C \mathbf{v}\cdot d\mathbf{r}$ with $\mathbf{v} = (\mathbf{i}y - \mathbf{j}x)/(x^2 + y^2)$ was
considered. It is easy to check that $\mathbf{v} = -\nabla\tan^{-1}(y/x)$, so that in this
case $u = -\tan^{-1}(y/x)$. This integral does not vanish when evaluated
over any closed path including the origin because the function $\tan^{-1}(y/x)$
is multiple-valued. Also, the continuity requirement of the theorem is
not fulfilled by v = ∇u at (0,0).
We can also establish another important theorem which is a converse
of Theorem I.
Theorem II. If a vector point function v is continuous in a given region R,
and if the integral $\int_C \mathbf{v}\cdot d\mathbf{r}$ is independent of the path, then a single-valued
scalar u exists such that v = ∇u in R.
We shall prove this theorem by actual construction of the function
u(x,y,z) fulfilling the conditions of this theorem.
By hypothesis, the integral $\int_C \mathbf{v}\cdot d\mathbf{r}$ when evaluated over any curve
C joining $P_0(x_0,y_0,z_0)$ with P(x,y,z) is independent of the path and thus
defines a single-valued function
$$u(x,y,z) = \int_{(x_0,y_0,z_0)}^{(x,y,z)} (v_x\,dx + v_y\,dy + v_z\,dz). \tag{5-6}$$
We shall show that this function is, indeed, such that v = ∇u.
On replacing x by x + Δx in (5-6), we get
$$u(x + \Delta x, y, z) = \int_{(x_0,y_0,z_0)}^{(x+\Delta x,\,y,\,z)} (v_x\,dx + v_y\,dy + v_z\,dz) \tag{5-7}$$
and on subtracting (5-6) from (5-7), we obtain
$$u(x + \Delta x, y, z) - u(x,y,z) = \int_{(x,y,z)}^{(x+\Delta x,\,y,\,z)} (v_x\,dx + v_y\,dy + v_z\,dz). \tag{5-8}$$
The integral in (5-8) is independent of the path joining (x,y,z) with
(x + Δx, y, z), and it suits our purposes to evaluate it over the rectilinear
path y = const, z = const. Over such a path dy = dz = 0, and hence
(5-8) yields
$$u(x + \Delta x, y, z) - u(x,y,z) = \int_x^{x+\Delta x} v_x(x,y,z)\,dx. \tag{5-9}$$
But by the mean-value theorem for integrals
$$\int_x^{x+\Delta x} v_x(x,y,z)\,dx = v_x(\xi,y,z)\,\Delta x, \tag{5-10}$$
where x ≤ ξ ≤ x + Δx. The substitution from (5-10) in (5-9), on dividing
by Δx, gives
$$\frac{u(x + \Delta x, y, z) - u(x,y,z)}{\Delta x} = v_x(\xi,y,z).$$
Now, if we let Δx → 0, we get
$$\frac{\partial u}{\partial x} = v_x(x,y,z) \tag{5-11}$$
by recalling the definition of partial derivative and by the fact that $v_x$
is continuous. In a similar way we prove that
$$\frac{\partial u}{\partial y} = v_y(x,y,z) \tag{5-12}$$
and
$$\frac{\partial u}{\partial z} = v_z(x,y,z). \tag{5-13}$$
But the statements (5-11) to (5-13) are equivalent to the vector equation
∇u = v, and the theorem is thus proved.
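The construction used in the proof lends itself to a direct numerical sketch. The code below assumes a hypothetical gradient field v = (2xy, x² + z, y + cos z), chosen only for illustration and not taken from the text, builds u from (5-6) by integrating along three axis-parallel legs starting at $P_0$ = (0,0,0), and then compares central-difference estimates of the partial derivatives of u with the components of v. The names `simpson` and `potential` and the step size are assumptions.

```python
import math

def simpson(F, a, b, n=200):
    """Composite Simpson rule for the ordinary integral of F over [a, b]; n is even."""
    h = (b - a) / n
    s = F(a) + F(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * F(a + i * h)
    return s * h / 3

def potential(v, p, p0=(0.0, 0.0, 0.0)):
    """u(P) as in (5-6), with the path made of three legs parallel to the axes."""
    x0, y0, z0 = p0
    x, y, z = p
    leg_x = simpson(lambda s: v(s, y0, z0)[0], x0, x)   # dy = dz = 0 on this leg
    leg_y = simpson(lambda s: v(x, s, z0)[1], y0, y)    # dx = dz = 0 on this leg
    leg_z = simpson(lambda s: v(x, y, s)[2], z0, z)     # dx = dy = 0 on this leg
    return leg_x + leg_y + leg_z

def v(x, y, z):
    # Hypothetical field, the gradient of x^2 y + y z + sin z.
    return (2 * x * y, x * x + z, y + math.cos(z))

p = (1.2, -0.7, 0.5)
u_p = potential(v, p)
u_closed_form = p[0]**2 * p[1] + p[1] * p[2] + math.sin(p[2])

h = 1e-4
grad_u = []
for i in range(3):
    fwd, bwd = list(p), list(p)
    fwd[i] += h
    bwd[i] -= h
    grad_u.append((potential(v, fwd) - potential(v, bwd)) / (2 * h))

print(u_p, u_closed_form)
print(grad_u, v(*p))
```

The choice of path is immaterial here only because the assumed field is a gradient; for a field whose integral depends on the path, a different route from $P_0$ to P would give a different value of u.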
It should be carefully noted that the key hypothesis which ensured the
existence of a single-valued function u such that v = ∇u is that the integral
$\int_C \mathbf{v}\cdot d\mathbf{r}$ is independent of the path. The integrand $\mathbf{v}\cdot d\mathbf{r} = v_x\,dx + v_y\,dy + v_z\,dz$
may be an exact differential of a multivalued function u, in which
case the integral $\int_C \mathbf{v}\cdot d\mathbf{r}$ may depend on the path.
A differential form
$$v_x(x,y,z)\,dx + v_y(x,y,z)\,dy + v_z(x,y,z)\,dz, \tag{5-14}$$
in which $v_x$, $v_y$, $v_z$ are continuously differentiable single-valued functions,
is said to be exact if
$$v_x\,dx + v_y\,dy + v_z\,dz = \frac{\partial u}{\partial x}\,dx + \frac{\partial u}{\partial y}\,dy + \frac{\partial u}{\partial z}\,dz, \tag{5-15}$$
where u is not necessarily single-valued. We can deduce a set of necessary
conditions for (5-14) to be an exact differential as follows: If there exists a
function u(x,y,z) such that (5-15) is true, then on setting x = const, y =
const, z = const, in turn, we get
$$v_x = \frac{\partial u}{\partial x}, \qquad v_y = \frac{\partial u}{\partial y}, \qquad v_z = \frac{\partial u}{\partial z}. \tag{5-16}$$
Differentiating the first of Eqs. (5-16) with respect to y and the second
with respect to x, we get
$$\frac{\partial v_x}{\partial y} = \frac{\partial^2 u}{\partial y\,\partial x}, \qquad \frac{\partial v_y}{\partial x} = \frac{\partial^2 u}{\partial x\,\partial y}.$$
But the mixed partial derivatives in these expressions are equal, since
$\partial v_x/\partial y$ and $\partial v_y/\partial x$ are continuous by hypothesis (see Sec. 2, Chap. 3).
Thus
$$\frac{\partial v_x}{\partial y} = \frac{\partial v_y}{\partial x}. \tag{5-17}$$
In a similar way we obtain two more relations
$$\frac{\partial v_y}{\partial z} = \frac{\partial v_z}{\partial y}, \qquad \frac{\partial v_z}{\partial x} = \frac{\partial v_x}{\partial z}. \tag{5-18}$$
The relations (5-17) and (5-18) give a necessary condition to be satisfied
by the functions $v_x$, $v_y$, $v_z$ in (5-15) if that differential form is to be an exact
differential of some function u(x,y,z). We shall see in Sec. 12 that these
conditions suffice to ensure the existence of a function u such that (5-14)
is equal to du. However, the conditions (5-17) and (5-18) do not guarantee
that u is single-valued. The question naturally arises: What supplementary
conditions must be adjoined to Eqs. (5-17) and (5-18) to ensure that u
is single-valued? A complete answer to this question is complex because
it depends not only on the differentiability properties of $v_x$, $v_y$, $v_z$ but also
on the geometry of the region in which these functions are defined. If
the region of definition of these functions is simply connected and
sufficiently regular to permit the use of certain integral transformation
theorems discussed in Secs. 9 to 11, then u(x,y,z) determined from the formula
$$u(x,y,z) = \int_{(x_0,y_0,z_0)}^{(x,y,z)} (v_x\,dx + v_y\,dy + v_z\,dz) \tag{5-19}$$
is single-valued. We describe these restrictions on the character of the
region in the following section.
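The distinction between the necessary condition (5-17) and single-valuedness can be seen on the two plane fields of Sec. 4. The sketch below, with an illustrative helper `exactness_defect` and an arbitrarily chosen test point and step size, estimates ∂v_x/∂y − ∂v_y/∂x by central differences.

```python
def exactness_defect(v, x, y, h=1e-5):
    """Estimate dv_x/dy - dv_y/dx at (x, y); the value is zero when (5-17) holds."""
    dvx_dy = (v(x, y + h)[0] - v(x, y - h)[0]) / (2 * h)
    dvy_dx = (v(x + h, y)[1] - v(x - h, y)[1]) / (2 * h)
    return dvx_dy - dvy_dx

def v2(x, y):
    return (y, 2 * x)                   # field of Example 2, Sec. 4

def v3(x, y):
    d = x * x + y * y
    return (y / d, -x / d)              # field of Example 3, Sec. 4

d2 = exactness_defect(v2, 0.3, 0.4)     # close to 1 - 2 = -1: (5-17) fails
d3 = exactness_defect(v3, 0.3, 0.4)     # close to 0: (5-17) holds away from the origin
print(d2, d3)
```

The field of Example 2 violates (5-17), consistent with its path-dependent integral. The field of Example 3 satisfies (5-17) at every point other than the origin, yet its integral around the origin is −2π, because the region with the origin removed is not simply connected.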
PROBLEMS
1. Show that the integral $\int \mathbf{r}\cdot d\mathbf{r}$ is independent of the path and find its value when
computed over the rectilinear path joining (0,0,0) and (1,1,1). Hint: $\mathbf{r}\cdot d\mathbf{r} = \tfrac{1}{2}\,d(r^2)$.
2. Show that $(y - x^2)\,dx + (x + y^2)\,dy$ is an exact differential du, and find u(x,y).
3. Show that the conditions (5-17) and (5-18) for an exact differential can be written
in symmetric form as
$$\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ v_x & v_y & v_z \end{vmatrix} = 0.$$
4. (a) Is yz dx + zx dy + xy dz an exact differential du? If so, find u(x,y,z). (b)
Evaluate the integral $\int (yz\,dx + zx\,dy + xy\,dz)$ over the rectilinear path joining
(0,0,0) to a fixed point (x,y,z).
5. If $\mathbf{v} = \mathbf{i}\,\dfrac{x}{x^2 + y^2} + \mathbf{j}\,\dfrac{y}{x^2 + y^2}$, show that $\int_C \mathbf{v}\cdot d\mathbf{r} = 0$ for every closed path that
does not include the origin. What is the value of this integral over the circular path
$x^2 + y^2 = 1$? Find u such that $\mathbf{v}\cdot d\mathbf{r} = \nabla u\cdot d\mathbf{r}$.
6. Compute $\int_C \nabla u\cdot d\mathbf{r}$ where $u = \log(x^2 + y^2)$ and C is the circle $x^2 + y^2 = 1$.
7. Find a function u such that du = (x/…) dx − (y/…) dy if $x^2 > y^2$. [The denominators are illegible in the source.]
8. If $\mathbf{v} = \nabla\dfrac{1}{r}$, compute $\int_{(x_0,y_0,z_0)}^{(x,y,z)} \nabla\dfrac{1}{r}\cdot d\mathbf{r}$ over some simple path that does not pass
through (0,0,0).
TRANSFORMATION THEOREMS
6. Simply Connected Regular Regions. The validity of several
important theorems on the transformation of surface and volume integrals
presented in the following sections hinges on the regularity and
connectivity of domains of definition of functions appearing in the integrals.
A careful characterization of such domains is extremely involved and is
quite out of place in this book, but in order to aid the reader in
understanding the circumstances under which the theorems in question are valid,
we give a qualitative discussion.
We shall say that a given region is connected if every two points of it can
be joined by a smooth curve that lies entirely in the region. A region is
simply connected provided that every simple closed curve¹ drawn in its
interior can be shrunk to a point by continuous deformation without
crossing the boundaries of the region.
Thus, the interior of a square is simply connected, but the interior of a
ring bounded by two concentric circles $C_1$ and $C_2$ is not (Fig. 15) because
a closed curve C surrounding the inner circle cannot be shrunk to a point
without crossing it. Also, the interior of a sphere is simply connected, and so
is the interior of the region bounded by two concentric spheres, but the
interior of a torus (an anchor ring) is not simply connected. A region
that is not simply connected is called multiply connected.
In dealing with bounded three-
dimensional regions we shall say that
the bounding surface S is smooth if
at each point P of the surface one
can erect a normal n(P) which changes continuously as P moves along the
surface. A surface that can be subdivided by smooth curves into a finite
number of pieces each of which is smooth is called sectionally smooth or
piecewise smooth. The surface of a cube is an example of a piecewise
smooth surface.
The surfaces which we shall consider have two sides, although not all
surfaces are two-sided. A one-sided surface can be formed, for example,
by gluing the ends of a long strip in such a way that the upper side of one
end of the strip is joined onto the under side of the other end (Fig. 16).
If two oppositely directed normals PN and PN' are drawn at any point P
of the surface, then the normal PN when carried along the path PABCP
will coincide with PN'. It may be noted that this surface has a simple
closed curve as its boundary.
1 We recall that a simple closed curve is a closed curve consisting of a finite number of
nonintersecting smooth curves.
We shall suppose that all surfaces with which we deal are two-sided,
piecewise smooth, and such that for
some orientation of cartesian axes the
projections on the coordinate planes
consist of the interiors of simple closed
curves. Such surfaces we shall term
regular. If a region is a union of
finitely many regions each bounded by
a regular surface, it will also be called
regular.
Regions bounded by a cone, a
sphere, or a cube are regular simply
connected regions. The interior of a
torus is an example of a regular
multiply connected region.
7. Divergence. Let a continuously differentiable vector point function
v(P) be defined in a regular simply connected region R bounded by a
closed surface σ. The surface integral of the component of v in the
direction of the exterior unit normal n(P) to σ is called the flux of v over σ.
Thus, the flux F is
$$F = \int_\sigma \mathbf{v}\cdot\mathbf{n}\,d\sigma. \tag{7-1}$$
When v is the velocity of an incompressible fluid, the scalar F represents
the amount of fluid issuing from σ per unit time. The points of the region
at which the fluid is generated are termed sources, and those where it is
absorbed are sinks. When the total strength of the sources is greater than
that of the sinks, the flux is positive; when the strength is less, the flux is
negative.
Consider, now, a volume element τ containing within it a point P, and
denote the bounding surface of τ by σ. Then the flux of v over σ per unit
volume is
$$\frac{\int_\sigma \mathbf{v}\cdot\mathbf{n}\,d\sigma}{\tau}. \tag{7-2}$$
If we let the volume τ shrink to zero in such a way that the maximum
diameter tends to zero, the quotient (7-2) will have a limit called the
divergence of v at P. We denote the divergence of v by div v(P) so that
$$\operatorname{div}\mathbf{v}(P) \equiv \lim_{\tau\to 0} \frac{\int_\sigma \mathbf{v}\cdot\mathbf{n}\,d\sigma}{\tau}. \tag{7-3}$$
This quantity is a measure of the strength of the source at P.
Inasmuch as the volume τ is arbitrary, the existence and, indeed, the meaning of the
limit in (7-3) are not quite obvious mathematically. One may let τ approach zero while
staying similar to itself, or one may let τ become arbitrarily thin compared with its
length, and so on. It is tolerably clear when suitable restrictions are imposed that
all these processes yield a unique limit L, independent of the shape of τ. Moreover,
the convergence is uniform in the following sense: Given any ε > 0, there is a δ > 0
such that
$$\left|\frac{\int_\sigma \mathbf{v}\cdot\mathbf{n}\,d\sigma}{\tau} - L\right| < \epsilon, \tag{7-3a}$$
provided the maximum diameter of τ is less than δ. For rectangular solids τ this fact
is established in the next few paragraphs, though the proof in the general case is not
presented here.
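To calculate div v in cartesian coordinates we consider a volume τ in
the shape of a rectangular parallelepiped with center at P(x,y,z) and
with edges Δx, Δy, Δz (Fig. 17). The flux of v over the surface of this
parallelepiped is easily computed. Since $\mathbf{v} = \mathbf{i}v_x + \mathbf{j}v_y + \mathbf{k}v_z$, the
normal component v·n of v over the face ABCD is $v_x$. Hence the outflow
over that face is $(v_x)_{x+\frac12\Delta x}\,\Delta y\,\Delta z$,
where $(v_x)_{x+\frac12\Delta x}$ is the mean value of $v_x$ over ABCD. Similarly, the
outflow over the parallel face EFGH is
$$(-v_x)_{x-\frac12\Delta x}\,\Delta y\,\Delta z,$$
where the minus sign appears because the exterior normal to EFGH is −i and hence $\mathbf{v}\cdot\mathbf{n} = -v_x$.
Thus the net outflow over a pair of faces parallel to the yz plane is
$$\left[(v_x)_{x+\frac12\Delta x} - (v_x)_{x-\frac12\Delta x}\right]\Delta y\,\Delta z \equiv v_x\Big|_{x-\frac12\Delta x}^{x+\frac12\Delta x}\,\Delta y\,\Delta z.$$
Proceeding in the same way with the remaining faces we get for the total
outflow
$$\int_\sigma \mathbf{v}\cdot\mathbf{n}\,d\sigma = v_x\Big|_{x-\frac12\Delta x}^{x+\frac12\Delta x}\,\Delta y\,\Delta z + v_y\Big|_{y-\frac12\Delta y}^{y+\frac12\Delta y}\,\Delta x\,\Delta z + v_z\Big|_{z-\frac12\Delta z}^{z+\frac12\Delta z}\,\Delta x\,\Delta y.$$
Since the volume of the parallelepiped is τ = Δx Δy Δz, the definition (7-3) gives
$$\operatorname{div}\mathbf{v}(P) = \lim\left(\frac{v_x\Big|_{x-\frac12\Delta x}^{x+\frac12\Delta x}}{\Delta x} + \frac{v_y\Big|_{y-\frac12\Delta y}^{y+\frac12\Delta y}}{\Delta y} + \frac{v_z\Big|_{z-\frac12\Delta z}^{z+\frac12\Delta z}}{\Delta z}\right) \tag{7-4}$$
as Δx, Δy, and Δz approach zero in any manner. Now, the three limits in
(7-4) are the respective partial derivatives, so that we obtain the important
formula
$$\operatorname{div}\mathbf{v}(P) = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}. \tag{7-5}$$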
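The passage from the flux per unit volume to the partial-derivative formula (7-5) can be watched numerically. The sketch below assumes a cube of edge h centered at P, approximates the mean value of the normal component on each face by an m-by-m midpoint grid (a choice made here, not prescribed by the text), and uses the field of the example at the end of this section; the function name and parameters are illustrative.

```python
def flux_per_unit_volume(v, p, h, m=4):
    """Outflow of v over the cube of edge h centered at p, divided by the volume h**3."""
    x, y, z = p
    # Midpoints of an m-by-m grid of sub-squares on a face, as offsets from the center.
    offs = [(-0.5 + (i + 0.5) / m) * h for i in range(m)]
    cell = (h / m) ** 2                      # area of one sub-square
    flux = 0.0
    for a in offs:
        for b in offs:
            # Pair of faces x = const: exterior normals +i and -i.
            flux += (v(x + h / 2, y + a, z + b)[0] - v(x - h / 2, y + a, z + b)[0]) * cell
            # Pair of faces y = const: exterior normals +j and -j.
            flux += (v(x + a, y + h / 2, z + b)[1] - v(x + a, y - h / 2, z + b)[1]) * cell
            # Pair of faces z = const: exterior normals +k and -k.
            flux += (v(x + a, y + b, z + h / 2)[2] - v(x + a, y + b, z - h / 2)[2]) * cell
    return flux / h**3

def v(x, y, z):
    return (3 * x**2, 5 * x * y**2, x * y * z**3)

P = (1.0, 2.0, 3.0)
estimates = [flux_per_unit_volume(v, P, h) for h in (0.1, 0.01, 0.001)]
print(estimates)     # the values approach 6x + 10xy + 3xyz^2 = 80 at P
```

For this polynomial field the quotient differs from 80 by a term proportional to h², so the estimates settle quickly as the cube shrinks.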
The fact that the limits are partial derivatives is suggested by the definition of partial
derivative (cf. Chap. 3, Sec. 2). Further discussion is required, however, because the
functions $v_x$, $v_y$, $v_z$ are mean values. By the theorem of the mean [see Eq. (3-7), Chap. 3]
$$\frac{G(x + \frac12\Delta x, y_1, z_1) - G(x - \frac12\Delta x, y_1, z_1)}{\Delta x} = G_1(\xi, y_1, z_1),$$
where $G_1$ stands for ∂G/∂x and where ξ is between $x - \frac12\Delta x$ and $x + \frac12\Delta x$. If $G_1$
is continuous, then
$$|G_1(\xi, y_1, z_1) - G_1(x,y,z)|$$
is as small as we please provided only that
$$|\xi - x| < \tfrac12\Delta x, \qquad |y - y_1| < \tfrac12\Delta y, \qquad |z - z_1| < \tfrac12\Delta z$$
with Δx, Δy, and Δz sufficiently small. Hence the mean value
$$\frac{1}{\Delta y\,\Delta z}\int_{z-\frac12\Delta z}^{z+\frac12\Delta z}\int_{y-\frac12\Delta y}^{y+\frac12\Delta y} G_1(\xi, y_1, z_1)\,dy_1\,dz_1$$
is as close as we please to $G_1(x,y,z)$, and the limit is therefore $G_1(x,y,z)$. Applying this
result to (7-4) with $G = v_x$ gives $\partial v_x/\partial x$ for the first limit, and the others follow by
symmetry.
The analysis shows that we may let Δx, Δy, and Δz approach zero in any manner.
For instance, if Δx → 0 first, then Δy → 0, and finally Δz → 0, the volume becomes
a plane, a line, and finally a point. On the other hand if we set
$$\Delta x = ah, \qquad \Delta y = bh, \qquad \Delta z = ch,$$
where a, b, c are constant, and let h → 0, then the volume stays similar to itself. Not
only is the same limit obtained in all such cases, but the departure from that limit is
seen to be uniformly small, provided only that
$$\max(|\Delta x|, |\Delta y|, |\Delta z|)$$
is small. The remarks made in connection with (7-3a) are thus verified in this case.
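In terms of the differential operator

$$\nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}$$

introduced in Sec. 3, we can consider a symbolic scalar product

$$\nabla\cdot\mathbf{v} = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\cdot(\mathbf{i}v_x + \mathbf{j}v_y + \mathbf{k}v_z) = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}.$$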
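On comparing this with (7-5) we see that

$$\operatorname{div}\mathbf{v} = \nabla\cdot\mathbf{v}. \tag{7-6}$$

We can also define the Laplacian operator $\nabla^2$ by the formula

$$\nabla^2 \equiv \nabla\cdot\nabla = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{7-7}$$

and observe that if $\mathbf{v} = \nabla u$, then

$$\operatorname{div}\nabla u = \nabla\cdot\nabla u = \nabla^2 u. \tag{7-8}$$

Furthermore, if the symbol $\nabla\times\mathbf{v}$ is defined by the rule for computing vector products, we get

$$\nabla\times\mathbf{v} = \mathbf{i}\left(\frac{\partial v_z}{\partial y} - \frac{\partial v_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial v_x}{\partial z} - \frac{\partial v_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y}\right). \tag{7-9}$$

It is worth observing that the condition $\nabla\times\mathbf{v} = 0$ requires that each component of the vector $\nabla\times\mathbf{v}$ be zero. We can therefore write Eqs. (5-17) and (5-18) (which ensure the existence of a scalar $u$ such that $\mathbf{v} = \nabla u$) in the compact form $\nabla\times\mathbf{v} = 0$.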
In Sec. 13 we shall deduce a formula for div v analogous to (7-5) when
the vector v is referred to an arbitrary orthogonal curvilinear coordinate
system. It is important to note that the definition (7-3) is independent
of the choice of coordinates, so that div v is an invariant.
Example. If $\mathbf{v} = \mathbf{i}\,3x^2 + \mathbf{j}\,5xy^2 + \mathbf{k}\,xyz^3$, compute div v at (1,2,3) and $\nabla\times\mathbf{v}$ at $(x,y,z)$.
Since $v_x = 3x^2$, $v_y = 5xy^2$, $v_z = xyz^3$, the substitution in (7-5) yields $\operatorname{div}\mathbf{v} = 6x + 10xy + 3xyz^2$. At the point (1,2,3), $\operatorname{div}\mathbf{v} = 6 + 20 + 54 = 80$. If v is interpreted as the velocity vector of fluid particles, we conclude that the point (1,2,3) is a source of the fluid.
To compute $\nabla\times\mathbf{v}$ we use formula (7-9) and find

$$\nabla\times\mathbf{v} = \mathbf{i}(xz^3 - 0) + \mathbf{j}(0 - yz^3) + \mathbf{k}(5y^2 - 0).$$

Since this vector is not identically zero, we conclude that no scalar function $u(x,y,z)$ exists such that $\mathbf{v} = \nabla u$.
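The arithmetic of this example can be checked numerically. The following is only a minimal sketch, not part of the text: the helper names `partial`, `div`, and `curl` are hypothetical, and the derivatives in (7-5) and in the cartesian curl formula are approximated by central differences with an arbitrarily chosen step.

```python
# Finite-difference check of the worked example (helper names are illustrative).

def v(x, y, z):
    """Field of the example: v = i 3x^2 + j 5xy^2 + k xyz^3."""
    return (3 * x**2, 5 * x * y**2, x * y * z**3)

def partial(f, comp, axis, p, h=1e-5):
    """Central difference of component `comp` of f along coordinate `axis` at p."""
    lo, hi = list(p), list(p)
    lo[axis] -= h
    hi[axis] += h
    return (f(*hi)[comp] - f(*lo)[comp]) / (2 * h)

def div(f, p):
    """Sum of dv_x/dx, dv_y/dy, dv_z/dz, as in formula (7-5)."""
    return sum(partial(f, i, i, p) for i in range(3))

def curl(f, p):
    """Cartesian components of the symbolic product del x v."""
    return (partial(f, 2, 1, p) - partial(f, 1, 2, p),
            partial(f, 0, 2, p) - partial(f, 2, 0, p),
            partial(f, 1, 0, p) - partial(f, 0, 1, p))

p = (1.0, 2.0, 3.0)
div_at_p = div(v, p)     # compare with 6x + 10xy + 3xyz^2 at (1,2,3)
curl_at_p = curl(v, p)   # compare with (xz^3, -yz^3, 5y^2) at (1,2,3)
```

At the point (1,2,3) the closed-form expressions give div v = 80 and curl v = (27, −54, 20), which the finite-difference values should reproduce up to discretization and rounding error.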
PROBLEMS
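1. Find div v if (a) $\mathbf{v} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$, (b) $\mathbf{v} = \mathbf{i}(x/r) + \mathbf{j}(y/r) + \mathbf{k}(z/r)$, where $r = \sqrt{x^2+y^2+z^2}$, (c) $\mathbf{v} = \mathbf{i}(z-y) + \mathbf{j}(x-z) + \mathbf{k}(y-x)$.
2. Compute $\nabla^2(1/r)$ and $\nabla^2 r$, where $r = \sqrt{x^2+y^2+z^2}$.
3. Show that (a) $\operatorname{div}(\mathbf{u}+\mathbf{v}) = \operatorname{div}\mathbf{u} + \operatorname{div}\mathbf{v}$, (b) $\operatorname{div}(u\mathbf{v}) = \nabla\cdot(u\mathbf{v}) = \nabla u\cdot\mathbf{v} + u\,\nabla\cdot\mathbf{v}$, (c) $\operatorname{div}(\mathbf{u}\times\mathbf{v}) = \nabla\cdot(\mathbf{u}\times\mathbf{v}) = \mathbf{v}\cdot(\nabla\times\mathbf{u}) - \mathbf{u}\cdot(\nabla\times\mathbf{v})$.
4. Show that $\operatorname{div}(\mathbf{r}\times\mathbf{a}) = 0$ if $\mathbf{r} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$ and a is a constant vector.
5. Find $\operatorname{div}(u\mathbf{v})$ if $u = x^2+y^2+z^2$ and $\mathbf{v} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$. Also find $\operatorname{div}(\nabla u\times\mathbf{v})$.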
8. The Divergence Theorem. An
important relationship connecting
the surface integral (7-1) for the flux
of a vector field with the volume in-
tegral of its divergence is deduced in
this section. The resulting integral
transformation theorem, known as
the Gauss or divergence theorem, is
fundamental to all developments in
mechanics of continuous media.
Let a continuously differentiable vector function $\mathbf{v}(P)$ be defined in a regular simply connected region $\tau$ bounded by the surface $\sigma$. We subdivide $\tau$ into $k$ volume cells $\Delta\tau_i$ in the shape of rectangular boxes and parts of boxes (Fig. 18) and compute the divergence

$$\operatorname{div}\mathbf{v}(P_i) = \lim_{\Delta\tau_i\to 0}\frac{\int_{\Delta\sigma_i}\mathbf{v}\cdot\mathbf{n}\,d\sigma}{\Delta\tau_i} \tag{8-1}$$

for each cell $\Delta\tau_i$. (The role of $\tau$ and $\sigma$ in (7-3) is now taken by $\Delta\tau_i$ and $\Delta\sigma_i$.) On recalling the definition of limit, we can rewrite (8-1) in the form

$$\int_{\Delta\sigma_i}\mathbf{v}\cdot\mathbf{n}\,d\sigma = (\operatorname{div}\mathbf{v})_i\,\Delta\tau_i + \epsilon_i\,\Delta\tau_i, \tag{8-2}$$

where the $\epsilon_i \to 0$ as $\Delta\tau_i \to 0$ and where $(\operatorname{div}\mathbf{v})_i \equiv \operatorname{div}\mathbf{v}(P_i)$. We next form the sum

$$\sum_{i=1}^{k}\int_{\Delta\sigma_i}\mathbf{v}\cdot\mathbf{n}\,d\sigma = \sum_{i=1}^{k}(\operatorname{div}\mathbf{v})_i\,\Delta\tau_i + \sum_{i=1}^{k}\epsilon_i\,\Delta\tau_i \tag{8-3}$$

over all the cells and observe that the surface integrals in (8-3) over the interfaces of adjacent cells vanish, since the exterior normals n to the common faces of the boxes point in opposite directions. Thus the surviving terms in the sum on the left in (8-3) correspond to surface elements belonging to the exterior surface $\sigma$, and hence this sum is equal to $\int_\sigma\mathbf{v}\cdot\mathbf{n}\,d\sigma$.
The sum $\sum_{i=1}^{k}(\operatorname{div}\mathbf{v})_i\,\Delta\tau_i$ approximates the volume integral $\int_\tau\operatorname{div}\mathbf{v}\,d\tau$, and indeed, the approximation can be made as close as we wish by suitably decreasing the size (that is, the maximum diameter) of the cells.¹ The sum of terms $\sum_{i=1}^{k}\epsilon_i\,\Delta\tau_i$ involves products of small quantities $\epsilon_i$ and $\Delta\tau_i$, and it becomes arbitrarily small² in the course of the process described. We thus conclude from (8-3) that

$$\int_\sigma\mathbf{v}\cdot\mathbf{n}\,d\sigma = \int_\tau\operatorname{div}\mathbf{v}\,d\tau. \tag{8-4}$$
The result embodied in this formula is the divergence theorem. This theorem expresses certain surface integrals as volume integrals, and since it contains no reference to any special coordinate system, the result is true in all coordinate systems. In particular, if v and n are expressed in terms of their cartesian components

$$\mathbf{n} = \mathbf{i}\cos(x,n) + \mathbf{j}\cos(y,n) + \mathbf{k}\cos(z,n),$$

we can write (8-4), on recalling (7-5), as

$$\int_\sigma[v_x\cos(x,n) + v_y\cos(y,n) + v_z\cos(z,n)]\,d\sigma = \int_\tau\left(\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}\right)d\tau. \tag{8-5}$$
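Example 1. Verify the theorem (8-4) for $\mathbf{v} = \mathbf{i}(x/r) + \mathbf{j}(y/r) + \mathbf{k}(z/r)$, where $r = \sqrt{x^2+y^2+z^2}$ and the region $\tau$ is the sphere $x^2+y^2+z^2 \le a^2$.
We readily find that

$$\frac{\partial v_x}{\partial x} = \frac{r^2-x^2}{r^3},\qquad \frac{\partial v_y}{\partial y} = \frac{r^2-y^2}{r^3},\qquad \frac{\partial v_z}{\partial z} = \frac{r^2-z^2}{r^3},$$

so that by (7-5)

$$\operatorname{div}\mathbf{v} = \frac{3r^2 - r^2}{r^3} = \frac{2}{r}.$$

Now $\int_\tau\operatorname{div}\mathbf{v}\,d\tau = \int_\tau\frac{2}{r}\,d\tau$,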
1 Of course, the number of cells $k$ must increase without limit as this process is carried out.
2 This is true by virtue of the uniformity emphasized earlier [see (7-3a)]. Thus, given $\epsilon > 0$, we can make the subdivision so fine that $|\epsilon_i| < \epsilon$ for all the $\epsilon_i$ at once. In that case
$$\left|\sum\epsilon_i\,\Delta\tau_i\right| \le \epsilon\sum\Delta\tau_i = \epsilon V,$$
where $V$ is the volume of the region.
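and it is easy to evaluate this integral in spherical coordinates, since in spherical coordinates $d\tau = r^2\sin\theta\,d\theta\,d\phi\,dr$ (see Prob. 2, Sec. 2). We have

$$\int_\tau\operatorname{div}\mathbf{v}\,d\tau = \int_\tau\frac{2}{r}\,d\tau = 8\int_0^a\int_0^{\pi/2}\int_0^{\pi/2}\frac{2}{r}\,r^2\sin\theta\,d\theta\,d\phi\,dr = 4\pi a^2.$$

On the other hand

$$\int_\sigma\mathbf{v}\cdot\mathbf{n}\,d\sigma = \int_\sigma 1\,d\sigma = 4\pi a^2,$$

since $\mathbf{v}\cdot\mathbf{n} = 1$, for $\mathbf{n} = \mathbf{i}(x/r) + \mathbf{j}(y/r) + \mathbf{k}(z/r)$ is directed along the radius of the sphere.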
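The two sides of (8-4) in Example 1 can also be compared by direct quadrature. The sketch below is an illustration under stated assumptions rather than part of the text: the function names and the radius value are arbitrary, a midpoint rule in spherical coordinates is used, and the independence of both integrands from $\phi$ is exploited by multiplying by $2\pi$.

```python
import math

def volume_integral_div(a, n=200):
    """Midpoint rule for the integral of div v = 2/r over the ball r <= a,
    using d tau = r^2 sin(theta) d theta d phi d r."""
    dr, dth = a / n, math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            total += (2.0 / r) * r * r * math.sin(th) * dr * dth
    return total * 2.0 * math.pi  # the integrand does not depend on phi

def surface_flux(a, n=200):
    """Midpoint rule for the flux of v through r = a, where v.n = 1 and
    d sigma = a^2 sin(theta) d theta d phi."""
    dth = math.pi / n
    total = sum(a * a * math.sin((j + 0.5) * dth) * dth for j in range(n))
    return total * 2.0 * math.pi

a = 1.5                        # arbitrary radius chosen for the check
lhs = surface_flux(a)          # surface integral of v.n
rhs = volume_integral_div(a)   # volume integral of div v
exact = 4.0 * math.pi * a * a
```

Both `lhs` and `rhs` should agree with $4\pi a^2$ to within the quadrature error, which shrinks as `n` grows.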
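Example 2. Prove with the aid of the divergence theorem the relation

$$\int_\tau\nabla u\,d\tau = \int_\sigma u\,\mathbf{n}\,d\sigma, \tag{8-6}$$

where $u$ is a continuously differentiable scalar point function.
Now in cartesian coordinates

$$\mathbf{n} = \mathbf{i}\cos(x,n) + \mathbf{j}\cos(y,n) + \mathbf{k}\cos(z,n) = \mathbf{i}(\mathbf{n}\cdot\mathbf{i}) + \mathbf{j}(\mathbf{n}\cdot\mathbf{j}) + \mathbf{k}(\mathbf{n}\cdot\mathbf{k})$$

and

$$\nabla u = \mathbf{i}\frac{\partial u}{\partial x} + \mathbf{j}\frac{\partial u}{\partial y} + \mathbf{k}\frac{\partial u}{\partial z},$$

so that (8-6) is equivalent to the three equations

$$\int_\tau\frac{\partial u}{\partial x}\,d\tau = \int_\sigma(\mathbf{i}u)\cdot\mathbf{n}\,d\sigma,\qquad \int_\tau\frac{\partial u}{\partial y}\,d\tau = \int_\sigma(\mathbf{j}u)\cdot\mathbf{n}\,d\sigma,\qquad \int_\tau\frac{\partial u}{\partial z}\,d\tau = \int_\sigma(\mathbf{k}u)\cdot\mathbf{n}\,d\sigma.$$

But these are the special cases of formula (8-5) applied to vectors $\mathbf{v} = \mathbf{i}u$, $\mathbf{v} = \mathbf{j}u$, $\mathbf{v} = \mathbf{k}u$, and thus the correctness of (8-6) is established.
Formula (8-6) can serve as a basis for a definition of $\nabla u$ in the form

$$\nabla u = \lim_{\tau\to 0}\frac{\int_\sigma u\,\mathbf{n}\,d\sigma}{\tau} \tag{8-7}$$

analogous to (7-3).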
PROBLEMS
1. Prove that $\int_\sigma\mathbf{r}\cdot\mathbf{n}\,d\sigma = 3\tau$, where r is the position vector of a point on the surface of a regular simply connected region of volume $\tau$. Hint: Apply the divergence theorem to the surface integral.
2. Compute $\int_\sigma\mathbf{v}\cdot\mathbf{n}\,d\sigma$, where $\sigma$ is the surface of the cylinder $x^2+y^2 = a^2$ bounded by the planes $z = 0$, $z = h$, and where $\mathbf{v} = \mathbf{i}x - \mathbf{j}y + \mathbf{k}z$.
3. Find $\int_\sigma\mathbf{r}\cdot\mathbf{n}\,d\sigma$, where r is the position vector of points on the surface of the ellipsoid $(x^2/a^2) + (y^2/b^2) + (z^2/c^2) = 1$.
4. Find the value of $\int_\sigma\mathbf{v}\cdot\mathbf{n}\,d\sigma$, where $\mathbf{v} = r^2(\mathbf{i}x + \mathbf{j}y + \mathbf{k}z)$, $r^2 = x^2+y^2+z^2$, and $\sigma$ is the surface of the sphere $x^2+y^2+z^2 = a^2$. Compute the integral directly and also with the aid of the divergence theorem.
5. If $\mathbf{v} = \nabla u$ and $\nabla^2 u = \rho$, where $\rho$ is a specified scalar point function, show that
$$\int_\sigma\frac{\partial u}{\partial n}\,d\sigma = \int_\tau\rho\,d\tau.$$
Hint: Recall that
$$\frac{\partial u}{\partial n} = \nabla u\cdot\mathbf{n}.$$
6. Use the divergence theorem to show that
$$\int_\tau\nabla\cdot(u\nabla v)\,d\tau = \int_\sigma u\,\nabla v\cdot\mathbf{n}\,d\sigma.$$
Show that this equation can be written as
$$\int_\tau u\nabla^2 v\,d\tau = \int_\sigma u\frac{\partial v}{\partial n}\,d\sigma - \int_\tau\nabla u\cdot\nabla v\,d\tau.$$
This important relation is known as Green's first identity.
7. Using Prob. 6 obtain the symmetrical form of Green's identity, namely,
$$\int_\tau(u\nabla^2 v - v\nabla^2 u)\,d\tau = \int_\sigma\left(u\frac{\partial v}{\partial n} - v\frac{\partial u}{\partial n}\right)d\sigma,$$
which is also known as Green's second identity. (It is assumed in this identity that both $u$ and $v$ have continuous second derivatives.) Green's identities are perhaps the most frequently encountered transformation formulas in mathematical physics.
8. If the twice-differentiable function $u$ satisfies Laplace's equation $\nabla^2 u = 0$, what is the value of $\int_\sigma\frac{\partial u}{\partial n}\,d\sigma$? Hint: Set $v = 1$ in Green's second identity, Prob. 7.
9. Green's Theorem. Line Integral in the Plane. Because of the importance in applications of line integrals defined over plane curves, we deduce here a special form of the divergence theorem commonly called Green's theorem in the plane.
Let a vector function

$$\mathbf{v} = \mathbf{i}v_x(x,y) + \mathbf{j}v_y(x,y) \tag{9-1}$$

with continuously differentiable components $v_x$, $v_y$ be defined in the plane region $R$ bounded by a simple closed curve $C$ (Fig. 19). If we construct a right cylinder of height $h$ with base $R$ and apply formula (8-5) to the region $\tau$ bounded by this cylinder, we get

$$\int_\sigma[v_x\cos(x,n) + v_y\cos(y,n)]\,d\sigma = \int_\tau\left(\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y}\right)d\tau. \tag{9-2}$$

The exterior unit normals n to the top and bottom bases of the cylinder are k and −k, respectively, and hence $\cos(x,n) = \cos(y,n) = 0$ on the bases of the cylinder. The contribution to the surface integral in (9-2) from the bases is, therefore, zero, and the integral need be evaluated only over the lateral surface. The element of surface $d\sigma$ of the lateral surface is $d\sigma = h\,ds$, where $ds$ is the arc element of $C$, and the volume element $d\tau$ can be taken in the form $d\tau = h\,dx\,dy$. We can thus write (9-2) as

$$\int_C[v_x\cos(x,n) + v_y\cos(y,n)]\,h\,ds = \iint_R\left(\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y}\right)h\,dy\,dx, \tag{9-3}$$

where n is the exterior normal to $C$. But from Fig. 20

$$\cos(x,n) = \frac{dy}{ds},\qquad \cos(y,n) = -\frac{dx}{ds}, \tag{9-4}$$

so that on dividing by $h$, Eq. (9-3) yields

$$\int_C(v_x\,dy - v_y\,dx) = \iint_R\left(\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y}\right)dx\,dy, \tag{9-5}$$

where in tracing $C$ the region $R$ remains on the left; that is, the path $C$ is described in the positive direction.
Formula (9-5) is Green's theorem in the plane. The function $-v_y(x,y)$ is sometimes denoted by $M(x,y)$ and $v_x(x,y)$ by $N(x,y)$, so that (9-5) assumes the form

$$\int_C(M\,dx + N\,dy) = -\iint_R\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right)dx\,dy. \tag{9-6}$$

Our restrictions on $v_x$ and $v_y$ demand that $M(x,y)$ and $N(x,y)$ be continuous and have continuous partial derivatives in the plane region $R$.
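Formula (9-6) lends itself to a quick numerical check. The sketch below is illustrative only: the choice $M = -y^3$, $N = x^3$ on the unit disk is an assumption made for the example, the partial derivatives are supplied by hand, and both integrals are approximated by midpoint rules.

```python
import math

def line_integral(M, N, n=2000):
    """Integral of M dx + N dy around the unit circle, traversed
    counterclockwise so that the disk stays on the left."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        total += (M(x, y) * (-math.sin(t)) + N(x, y) * math.cos(t)) * h
    return total

def double_integral(f, n=400):
    """Integral of f over the unit disk, midpoint rule in polar coordinates."""
    dr, dt = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            total += f(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

# Sample functions (an assumption for this check) and their derivatives.
M = lambda x, y: -y**3
N = lambda x, y: x**3
dM_dy = lambda x, y: -3 * y**2
dN_dx = lambda x, y: 3 * x**2

lhs = line_integral(M, N)
rhs = -double_integral(lambda x, y: dM_dy(x, y) - dN_dx(x, y))
```

For this choice both sides equal $3\pi/2$ in closed form, so `lhs` and `rhs` should agree with that value up to quadrature error.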
We see that if $\partial M/\partial y = \partial N/\partial x$ at all points of $R$, then $\int_C(M\,dx + N\,dy) = 0$ over every simple closed curve $C$ drawn in $R$. Conversely, if the line integral in (9-6) vanishes for every simple closed path $C$ in $R$, then

$$\iint_R\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right)dx\,dy = 0 \tag{9-7}$$

for every region $R$. This enables us to prove that

$$\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \tag{9-8}$$

at every point of $R$; for suppose that

$$\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x} \ne 0 \tag{9-9}$$

at some point $P$, and for definiteness let this difference be positive. Since $(\partial M/\partial y) - (\partial N/\partial x)$ is continuous, there is a small region $R'$ including $P$ throughout which the integrand in (9-7) is positive. But this means that the integral is also positive, and since (9-7) is known to yield zero for every region $R$, we have a contradiction. Thus, the hypothesis (9-9) is untenable.
We summarize these results as a theorem.
Theorem. A necessary and sufficient condition for the line integral $\int_C(M\,dx + N\,dy)$ to vanish for every simple closed path drawn in a simply connected region $R$, where $M$, $N$, $\partial M/\partial y$, and $\partial N/\partial x$ are continuous, is that $\partial M/\partial y = \partial N/\partial x$ at all points of $R$.
The vanishing of the integral

$$\oint(M\,dx + N\,dy) \tag{9-10}$$

over every closed path is equivalent to the statement that this integral is independent of the path, and it follows from Sec. 5 that the expression $M\,dx + N\,dy$ is an exact or total differential $du$ of a single-valued function $u(x,y)$ determined by the formula [cf. Eq. (5-6)]

$$u(x,y) = \int_{(x_0,y_0)}^{(x,y)}(M\,dx + N\,dy). \tag{9-11}$$

We recognize condition (9-8) to be identical with (5-17).
The theorem (9-6) can be extended to suitable plane multiply connected domains in the following way. If $R$ is a doubly connected region bounded externally by a contour $C_0$ and internally by a contour $C_1$ (Fig. 21), we introduce a "cut" $C$ joining some point $P_0$ on $C_0$ with $P_1$ on $C_1$. The cut $C$ can be visualized as a slit in the region $R$, forming the boundary $C + C_0 + C_1$ of the slit region. The slit region $R$ is simply connected, and if we apply formula (9-6) to it, we get

$$-\iint_R\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right)dx\,dy = \oint_{C_0\,\circlearrowleft}(M\,dx + N\,dy) + \int_{P_0}^{P_1}(M\,dx + N\,dy) + \oint_{C_1\,\circlearrowright}(M\,dx + N\,dy) + \int_{P_1}^{P_0}(M\,dx + N\,dy). \tag{9-12}$$

The arrows on the integrals in (9-12) refer to the direction of integration along $C_0$ and $C_1$ as shown in Fig. 21, and the integrals $\int_{P_0}^{P_1}$ and $\int_{P_1}^{P_0}$ are evaluated along $C$ in the direction indicated by the limits. Inasmuch as $\int_{P_0}^{P_1} = -\int_{P_1}^{P_0}$, Eq. (9-12) reduces to

$$-\iint_R\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right)dx\,dy = \oint_{C_0\,\circlearrowleft}(M\,dx + N\,dy) + \oint_{C_1\,\circlearrowright}(M\,dx + N\,dy). \tag{9-13}$$

An obvious extension of this result to the region $R$ bounded externally by $C_0$ and internally by $n$ contours $C_i$ (Fig. 22) yields

$$-\iint_R\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right)dx\,dy = \oint_{C_0\,\circlearrowleft}(M\,dx + N\,dy) + \sum_{i=1}^{n}\oint_{C_i\,\circlearrowright}(M\,dx + N\,dy). \tag{9-14}$$

An important result follows directly from formula (9-14) if it is supposed that continuously differentiable functions $M$ and $N$ are such that

$$\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \tag{9-15}$$

in the region $R$. If (9-15) holds in $R$, the double integral in (9-14) vanishes and we get

$$\oint_{C_0\,\circlearrowleft}(M\,dx + N\,dy) = -\sum_{i=1}^{n}\oint_{C_i\,\circlearrowright}(M\,dx + N\,dy) = \sum_{i=1}^{n}\oint_{C_i\,\circlearrowleft}(M\,dx + N\,dy).$$

Thus, the line integral over the exterior contour $C_0$ taken in the counterclockwise direction is equal to the sum of the line integrals over the interior contours $C_i$ taken in the same direction. In particular, if there is only one interior contour $C_1$ (Fig. 21), we conclude that

$$\oint_{C_0}(M\,dx + N\,dy) = \oint_{C_1}(M\,dx + N\,dy). \tag{9-16}$$

This integral need not vanish. If, however, continuously differentiable functions $M$ and $N$ are also defined in the region interior to $C_1$ and satisfy the condition (9-15) in that region, then the value of the integral $\oint_{C_1}(M\,dx + N\,dy)$ is zero, inasmuch as the integral on the left in (9-16) vanishes by theorem (9-6).
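The remark that the integral in (9-16) need not vanish can be illustrated numerically. The sketch below is an assumption-laden illustration, not part of the text: it takes $M = -y/(x^2+y^2)$ and $N = x/(x^2+y^2)$, which satisfy (9-15) everywhere except at the origin, and integrates counterclockwise around circles by a midpoint rule; the function name `circulation` is hypothetical.

```python
import math

def circulation(cx, cy, radius, n=4000):
    """Integral of (-y dx + x dy)/(x^2 + y^2) counterclockwise around the
    circle of the given radius centred at (cx, cy)."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t) * h
        dy = radius * math.cos(t) * h
        total += (-y * dx + x * dy) / (x * x + y * y)
    return total

outer = circulation(0.0, 0.0, 2.0)   # exterior contour enclosing the origin
inner = circulation(0.0, 0.0, 1.0)   # interior contour enclosing the origin
away = circulation(3.0, 0.0, 1.0)    # contour not enclosing the origin
```

The values `outer` and `inner` agree with each other, as (9-16) requires, and both equal $2\pi$ rather than zero because $M$ and $N$ are not defined at the origin; the contour `away`, whose interior contains no singular point, gives zero.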
PROBLEMS
1. Show that the following integrals are independent of the path and find their values:
fO.V
(a) / f(x 2 *f y 2 ) dx -f 2 xy dy],
o.i)
rd.i)
(6) r V f rr+% dx + (Tt* dy ] 1
J (o.o) L(1 4~ x )* (J 4- x) 2 J
(c) $\int_{(0,0)}^{(\pi/2,\,\pi/2)}(y\cos x\,dx + \sin x\,dy)$,
id)
X —1,
X 2 < 1,
(e) $\int_{(1,1)}^{(2,3)}[(x+1)\,dx + (y+1)\,dy]$.
2. Write each of the integrals in Prob. 1 in the form $\int\mathbf{v}\cdot d\mathbf{r}$, and determine $u(x,y)$ such that $\nabla u = \mathbf{v}$.
3. Find the value of
$$\oint_C\left(\frac{-y\,dx}{x^2+y^2} + \frac{x\,dy}{x^2+y^2}\right),$$
where $C$ bounds the region interior to the circle $x^2+y^2 = 4$ and exterior to the circle $x^2+y^2 = 1$. What is the value of the integral (a) over the circle $x^2+y^2 = 4$? (b) Over the circle $x^2+y^2 = 1$?
4. Compute the integral
$$\iint_R\operatorname{div}\mathbf{v}\,dx\,dy,$$
where $\mathbf{v} = \mathbf{i}x + \mathbf{j}y$, over the region $R$ bounded by the circles $x^2+y^2 = 1$ and $x^2+y^2 = 4$.
5. Use formula (9-13) to evaluate the integral $\oint_C(-y\,dx + x\,dy)$, where $C$ is the path bounding the region $R$ in Prob. 4. What is the value of this integral over the path $x^2+y^2 = 1$? Over the path $x^2+y^2 = 4$?
10. Curl of a Vector Field. We saw in Sec. 7 that with every continuously differentiable vector function $\mathbf{v}(P)$ one can associate a scalar div $\mathbf{v}(P)$ defined by the formula

$$\operatorname{div}\mathbf{v}(P) = \lim_{\tau\to 0}\frac{\int_\sigma\mathbf{n}\cdot\mathbf{v}\,d\sigma}{\tau}, \tag{10-1}$$

which has a simple physical meaning.
We show next that $\mathbf{v}(P)$ can also be associated with a vector field called curl v, defined by an analogous formula

$$\operatorname{curl}\mathbf{v}(P) = \lim_{\tau\to 0}\frac{\int_\sigma\mathbf{n}\times\mathbf{v}\,d\sigma}{\tau}. \tag{10-2}$$

We shall see that curl $\mathbf{v}(P)$ bears an interesting relation to the concept of circulation in the vector field.
Let $\mathbf{v}(P)$ be defined in some regular three-dimensional region $R$, and let $C$ be a simple closed curve in $R$ bounding a plane area $A$. At a given point $P$ of $A$ we construct a unit normal $\boldsymbol{\nu}$ so directed that $\boldsymbol{\nu}$ points in the direction of an advancing right-hand screw when $C$ is traversed in the positive sense (Fig. 23). We then construct a right cylinder of small height $h$ with elements parallel to $\boldsymbol{\nu}$ and with base $A$ and denote its surface by $\sigma$ and its volume by $\tau$.
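Since $\boldsymbol{\nu}$ is a constant vector, formula (10-2) yields

$$\boldsymbol{\nu}\cdot\operatorname{curl}\mathbf{v} = \lim_{\tau\to 0}\frac{\int_\sigma\boldsymbol{\nu}\cdot\mathbf{n}\times\mathbf{v}\,d\sigma}{\tau}. \tag{10-3}$$

But along the bases of the cylinder $\boldsymbol{\nu}$ is parallel to the normal n, and hence the triple scalar product $\boldsymbol{\nu}\cdot\mathbf{n}\times\mathbf{v}$ vanishes over the bases. Accordingly, the integral in (10-3) need be computed only over the lateral surface of the cylinder. We can thus write

$$\boldsymbol{\nu}\cdot\operatorname{curl}\mathbf{v} = \lim_{\tau\to 0}\frac{\int_C\boldsymbol{\nu}\cdot\mathbf{n}\times\mathbf{v}\,h\,ds}{\tau}, \tag{10-4}$$

since $d\sigma = h\,ds$.
But $\boldsymbol{\nu}\cdot\mathbf{n}\times\mathbf{v} = \mathbf{v}\cdot\boldsymbol{\nu}\times\mathbf{n}$ by Chap. 4, Eq. (6-4), and $\boldsymbol{\nu}\times\mathbf{n} = \mathbf{t}$ along $C$, where t is the unit tangent vector to $C$. Thus the integrand in (10-4) can be written

$$\boldsymbol{\nu}\cdot\mathbf{n}\times\mathbf{v}\,h\,ds = \mathbf{v}\cdot\boldsymbol{\nu}\times\mathbf{n}\,h\,ds = \mathbf{v}\cdot\mathbf{t}\,h\,ds = h\,\mathbf{v}\cdot d\mathbf{r},$$

where $d\mathbf{r}$ is the differential of the position vector r of a point on $C$. If we further note that $\tau = hA$, we can rewrite (10-4) as

$$\boldsymbol{\nu}\cdot\operatorname{curl}\mathbf{v} = \lim_{A\to 0}\frac{\oint_C\mathbf{v}\cdot d\mathbf{r}}{A}. \tag{10-5}$$

The line integral $\oint_C\mathbf{v}\cdot d\mathbf{r}$ is called the circulation of v along $C$. If v represents the velocity of a fluid, then $\mathbf{v}\cdot d\mathbf{r} = \mathbf{v}\cdot\mathbf{t}\,ds$ takes account of the tangential component of velocity v, and a fluid particle moving with this velocity circulates along $C$. A particle moving with velocity $\mathbf{v}\cdot\mathbf{n}$ normal to $C$, on the other hand, crosses $C$. That is, it flows either into or out of the region bounded by $C$. Hence formula (10-5) provides a measure of the circulation per unit area at the point $P$. This formula can be used to compute the cartesian components of the vector curl v by taking $\boldsymbol{\nu}$ successively as the i, j, k base vectors and by evaluating the limit in the right-hand member. It is somewhat simpler, however, to get the formula for curl v in cartesian coordinates from the definition (10-2) with the aid of the divergence theorem.¹
Now the components of $\mathbf{n}\times\mathbf{v}$ in cartesian coordinates are $\mathbf{n}\times\mathbf{v}\cdot\mathbf{i}$, $\mathbf{n}\times\mathbf{v}\cdot\mathbf{j}$, $\mathbf{n}\times\mathbf{v}\cdot\mathbf{k}$. Consequently,

$$\mathbf{n}\times\mathbf{v} = \mathbf{i}(\mathbf{n}\times\mathbf{v}\cdot\mathbf{i}) + \mathbf{j}(\mathbf{n}\times\mathbf{v}\cdot\mathbf{j}) + \mathbf{k}(\mathbf{n}\times\mathbf{v}\cdot\mathbf{k}) \equiv \mathbf{i}(\mathbf{n}\cdot\mathbf{v}\times\mathbf{i}) + \mathbf{j}(\mathbf{n}\cdot\mathbf{v}\times\mathbf{j}) + \mathbf{k}(\mathbf{n}\cdot\mathbf{v}\times\mathbf{k}). \tag{10-6}$$

1 Also the uniformity of approach mentioned in connection with (7-3a) will then yield the same kind of uniformity for (10-2).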
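On inserting from (10-6) in (10-2) we get

$$\operatorname{curl}\mathbf{v} = \mathbf{i}\lim_{\tau\to 0}\frac{\int_\sigma(\mathbf{n}\cdot\mathbf{v}\times\mathbf{i})\,d\sigma}{\tau} + \mathbf{j}\lim_{\tau\to 0}\frac{\int_\sigma(\mathbf{n}\cdot\mathbf{v}\times\mathbf{j})\,d\sigma}{\tau} + \mathbf{k}\lim_{\tau\to 0}\frac{\int_\sigma(\mathbf{n}\cdot\mathbf{v}\times\mathbf{k})\,d\sigma}{\tau}. \tag{10-7}$$

But a comparison of the right-hand member of (10-7) with (10-1) enables us to rewrite (10-7) in the form

$$\operatorname{curl}\mathbf{v} = \mathbf{i}\operatorname{div}(\mathbf{v}\times\mathbf{i}) + \mathbf{j}\operatorname{div}(\mathbf{v}\times\mathbf{j}) + \mathbf{k}\operatorname{div}(\mathbf{v}\times\mathbf{k}). \tag{10-8}$$

On inserting

$$\mathbf{v} = \mathbf{i}v_x + \mathbf{j}v_y + \mathbf{k}v_z$$

in (10-8) we get

$$\operatorname{curl}\mathbf{v} = \mathbf{i}\operatorname{div}(\mathbf{j}v_z - \mathbf{k}v_y) + \mathbf{j}\operatorname{div}(\mathbf{k}v_x - \mathbf{i}v_z) + \mathbf{k}\operatorname{div}(\mathbf{i}v_y - \mathbf{j}v_x),$$

and a simple calculation making use of formula (7-5) yields the desired result

$$\operatorname{curl}\mathbf{v} = \mathbf{i}\left(\frac{\partial v_z}{\partial y} - \frac{\partial v_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial v_x}{\partial z} - \frac{\partial v_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y}\right). \tag{10-9}$$

If we recall the expression (3-13) for the symbolic vector $\nabla$, we can write (10-9) compactly as

$$\operatorname{curl}\mathbf{v} = \begin{vmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] v_x & v_y & v_z\end{vmatrix} \equiv \nabla\times\mathbf{v}. \tag{10-10}$$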
An analogous formula for curl v in orthogonal curvilinear coordinates is given in Sec. 13, and several useful relations involving the use of the curl operator are recorded in Prob. 1.
Example: Compute curl v if $\mathbf{v} = \mathbf{i}\,xyz + \mathbf{j}\,xyz^2 + \mathbf{k}\,x^3yz$.
The substitution of $v_x = xyz$, $v_y = xyz^2$, $v_z = x^3yz$ in (10-9) yields

$$\operatorname{curl}\mathbf{v} = \mathbf{i}(x^3z - 2xyz) + \mathbf{j}(xy - 3x^2yz) + \mathbf{k}(yz^2 - xz).$$
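The interpretation of (10-5) as circulation per unit area can be tried on the field of this example. The following sketch rests on stated assumptions: $\boldsymbol{\nu}$ is taken as k, the curve $C$ is a small circle of arbitrarily chosen radius in a plane $z = \text{const}$, the line integral is approximated by a midpoint rule, and the function names are hypothetical.

```python
import math

def v(x, y, z):
    """Field of the example: v = i xyz + j xyz^2 + k x^3 yz."""
    return (x * y * z, x * y * z**2, x**3 * y * z)

def circulation_per_area(p, rho=1e-3, n=720):
    """Circulation of v around a circle of radius rho centred at p in the
    plane z = const (counterclockwise seen from +z), divided by its area."""
    x0, y0, z0 = p
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = x0 + rho * math.cos(t), y0 + rho * math.sin(t)
        vx, vy, _ = v(x, y, z0)
        # v . dr with dr = (-rho sin t, rho cos t, 0) dt
        total += (vx * (-rho * math.sin(t)) + vy * rho * math.cos(t)) * h
    return total / (math.pi * rho**2)

def curl_z_formula(p):
    """k-component of curl v from the closed-form result of the example."""
    x, y, z = p
    return y * z**2 - x * z

p = (1.0, 2.0, 3.0)
estimate = circulation_per_area(p)
exact = curl_z_formula(p)
```

At (1,2,3) the k-component from (10-9) is $yz^2 - xz = 15$, and the circulation per unit area around the small circle should reproduce it closely.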
PROBLEMS
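1. Show that under suitable hypotheses on continuity of the derivatives:
(a) $\operatorname{curl}(\mathbf{A} + \mathbf{B}) = \operatorname{curl}\mathbf{A} + \operatorname{curl}\mathbf{B}$;
(b) $\operatorname{div}\operatorname{curl}\mathbf{A} = \nabla\cdot\nabla\times\mathbf{A} = 0$;
(c) $\operatorname{curl}\operatorname{curl}\mathbf{A} = \nabla\operatorname{div}\mathbf{A} - \nabla^2\mathbf{A}$, where $\nabla^2\mathbf{A} = \mathbf{i}\nabla^2A_x + \mathbf{j}\nabla^2A_y + \mathbf{k}\nabla^2A_z$;
(d) $\operatorname{curl}\nabla u = \nabla\times(\nabla u) = 0$;
(e) $\operatorname{curl}(u\mathbf{A}) = \nabla\times(u\mathbf{A}) = u\nabla\times\mathbf{A} + \nabla u\times\mathbf{A}$;
(f) $\operatorname{div}(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\operatorname{curl}\mathbf{A} - \mathbf{A}\cdot\operatorname{curl}\mathbf{B}$;
(g) $\operatorname{curl}(\mathbf{A}\times\mathbf{B}) = \mathbf{A}\,\nabla\cdot\mathbf{B} - \mathbf{B}\,\nabla\cdot\mathbf{A} + (\mathbf{B}\cdot\nabla)\mathbf{A} - (\mathbf{A}\cdot\nabla)\mathbf{B}$, where $(\mathbf{A}\cdot\nabla)\mathbf{B} \equiv \mathbf{C}$ is the vector with components
$$C_x = A_x\frac{\partial B_x}{\partial x} + A_y\frac{\partial B_x}{\partial y} + A_z\frac{\partial B_x}{\partial z},\quad\text{etc.}$$
4. Let a rigid body rotate with constant angular velocity $\boldsymbol{\Omega}$ about some axis through a point $O$ in the body. If r is the position vector of a point $P(x,y,z)$ relative to a set of axes fixed at $O$, the velocity v of $P$ is $\mathbf{v} = \mathbf{v}_0 + \boldsymbol{\Omega}\times\mathbf{r}$, where $\mathbf{v}_0$ is the velocity of $O$ relative to some reference frame fixed in space (cf. Sec. 8, Chap. 4). Show that $\operatorname{curl}\mathbf{v} = 2\boldsymbol{\Omega}$, so that the angular velocity $\boldsymbol{\Omega}$ at any instant of time is equal to one-half the curl of the velocity field. Note that the velocity $\mathbf{v}_0$ is independent of the coordinates $(x,y,z)$ of points in the body.
5. Show from geometrical considerations that the angle $d\theta$ subtended at the origin by an element $ds$ of a curve is $d\theta = (\mathbf{n}\cdot\mathbf{r}/r^2)\,ds$.
6. A solid angle $\omega$ subtended by a surface $\sigma$ is measured by the area subtended by the angle on a unit sphere $S$ with center at the vertex of $\omega$. Show that
$$\omega = -\int_\sigma\mathbf{n}\cdot\nabla\frac{1}{r}\,d\sigma,$$
where r is the position vector of points on $\sigma$ measured from the vertex of $\omega$ and n is the unit normal to $\sigma$. Hint: Apply the divergence theorem to a volume formed by the bundle of rays issuing from the solid angle and by the areas cut out by these rays on $S$ and on $\sigma$.
7. Referring to Prob. 6, show from geometrical considerations that
$$d\omega = \frac{\mathbf{n}\cdot\mathbf{r}}{r^3}\,d\sigma.$$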
11. Stokes's Theorem. This useful integral transformation theorem enables one to reduce the evaluation of certain surface integrals to the calculation of line integrals.
Let $R$ be a three-dimensional region in which $\mathbf{v}(P)$ is a continuously differentiable vector function and $\sigma$ a regular open surface embedded in $R$. We suppose that the edge of $\sigma$ is a simple closed curve $C$ (Fig. 24). Then it is true that

$$\int_\sigma\mathbf{n}\cdot\operatorname{curl}\mathbf{v}\,d\sigma = \oint_C\mathbf{v}\cdot d\mathbf{r}, \tag{11-1}$$

where n is a unit normal to $\sigma$ and the line integral over $C$ is evaluated in the direction determined by the chosen positive orientation of n.
To establish formula (11-1), which is known as Stokes's theorem, we follow a procedure similar to that used in Sec. 8 to prove the divergence theorem. We subdivide $\sigma$ into $k$ approximately planar elements of area $\Delta\sigma_i$, each bounded by a simple contour $C_i$ (say triangular) (see Fig. 24). Then formula (10-5) with $\boldsymbol{\nu}$ replaced by $\mathbf{n}_i$ and $A$ by $\Delta\sigma_i$ when applied to the element bounded by $C_i$ yields

$$\mathbf{n}_i\cdot\operatorname{curl}\mathbf{v}(P_i)\,\Delta\sigma_i = \oint_{C_i}\mathbf{v}\cdot d\mathbf{r} + \epsilon_i\,\Delta\sigma_i. \tag{11-2}$$

On summing these expressions over the entire surface $\sigma$ we get

$$\sum_{i=1}^{k}\mathbf{n}_i\cdot\operatorname{curl}\mathbf{v}(P_i)\,\Delta\sigma_i = \sum_{i=1}^{k}\oint_{C_i}\mathbf{v}\cdot d\mathbf{r} + \sum_{i=1}^{k}\epsilon_i\,\Delta\sigma_i. \tag{11-3}$$

But the line integrals in (11-3) when summed over the common boundaries of adjacent elements cancel out, since such boundaries are traversed twice in opposite directions. The surviving terms yield the line integral $\oint_C\mathbf{v}\cdot d\mathbf{r}$ over the boundary $C$. If the number $k$ of elements $\Delta\sigma_i$ is allowed to increase indefinitely, so that the greatest linear dimensions of the $\Delta\sigma_i$ tend to zero, the sum on the left becomes the surface integral $\int_\sigma\mathbf{n}\cdot\operatorname{curl}\mathbf{v}\,d\sigma$. The sum $\sum\epsilon_i\,\Delta\sigma_i$ tends to zero as in the discussion of (8-3).¹ Thus, formula (11-1) is correct. It should be noted that once a positive direction for the normal n has been agreed upon, the positive direction of description of the contours $C_i$, and hence of $C$, is determined by the right-hand-screw convention.
1 That is, if all $|\epsilon_i|$ are less than $\epsilon$, this sum is less than $\epsilon S$, where $S$ is the area of $\sigma$.
If $\sigma$ is a closed surface, the sum of the line integrals over the contours $C_i$ is zero, and in that event $\int_\sigma\mathbf{n}\cdot\operatorname{curl}\mathbf{v}\,d\sigma = 0$.
We note further that if $\operatorname{curl}\mathbf{v} = 0$ in $R$, then $\oint_C\mathbf{v}\cdot d\mathbf{r} = 0$ for an arbitrary closed contour $C$. Hence the line integral $\int_{P_0}^{P}\mathbf{v}\cdot d\mathbf{r}$ is independent of the path and thus defines a function $u(P)$, such that $du = \mathbf{v}\cdot d\mathbf{r}$. We can show conversely¹ that if the line integral in (11-1) vanishes for every closed path $C$ in $R$, then $\operatorname{curl}\mathbf{v} = 0$ throughout $R$. Reference to (10-9) shows that the condition $\operatorname{curl}\mathbf{v} = 0$ is identical with Eqs. (5-17) and (5-18), ensuring that $\mathbf{v}\cdot d\mathbf{r} = du$.
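Example: Evaluate $\int_\sigma\mathbf{n}\cdot\operatorname{curl}\mathbf{v}\,d\sigma$ over the surface $z = +\sqrt{a^2 - x^2 - y^2}$ if $\mathbf{v} = \mathbf{i}\,2y - \mathbf{j}\,x + \mathbf{k}\,z$.
The surface in this example is a hemisphere of radius $a$, and it is clear that

$$\mathbf{n} = \mathbf{i}\frac{x}{r} + \mathbf{j}\frac{y}{r} + \mathbf{k}\frac{z}{r},$$

where $\mathbf{r} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$ is the position vector for points on the hemisphere (so that $r = a$ there). We readily check that

$$\operatorname{curl}\mathbf{v} = \begin{vmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] 2y & -x & z\end{vmatrix} = -3\mathbf{k}.$$

Hence

$$\int_\sigma\mathbf{n}\cdot\operatorname{curl}\mathbf{v}\,d\sigma = -3\int_\sigma\frac{z}{a}\,d\sigma.$$

This integral can be easily evaluated by noting that (Fig. 25) $d\sigma = \sec\gamma\,dx\,dy$, where $\gamma$ is the angle between the normal n and the positive direction of the $z$ axis (cf. Chap. 3, Sec. 17). But from Fig. 25, $\sec\gamma = \sec\theta = a/z$, so that

$$\int_\sigma\mathbf{n}\cdot\operatorname{curl}\mathbf{v}\,d\sigma = -3\iint_A dx\,dy = -3\pi a^2, \tag{11-4}$$

since the region of integration $A$ is a circle of radius $a$. The reader will check this result by taking $d\sigma = a^2\sin\theta\,d\theta\,d\phi$ as the element of area of the surface of the sphere in spherical coordinates.
To obtain the result (11-4) from Stokes's theorem (11-1) we compute $\oint_C\mathbf{v}\cdot d\mathbf{r}$, where $C$ is the boundary of the circle $x^2+y^2 = a^2$. Since $d\mathbf{r} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$, we have

$$\oint_C\mathbf{v}\cdot d\mathbf{r} = \oint_C(2y\,dx - x\,dy + z\,dz). \tag{11-5}$$

But along $C$ we have $dz = 0$, and the equation of $C$ may be taken in the form

$$x = a\cos\phi,\qquad y = a\sin\phi,\qquad 0 \le \phi \le 2\pi.$$

We thus get for (11-5)

$$-\int_0^{2\pi}a^2(2\sin^2\phi + \cos^2\phi)\,d\phi = -3\pi a^2.$$

1 See the corresponding discussion in Sec. 9.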
1. Show that for the special case of a plane region bounded by a simple closed curve
C, Stokes’s theorem reduces to Green's theorem (9-6).
2. If $\mathbf{v} = \mathbf{i}y + \mathbf{j}z + \mathbf{k}x$ and $\sigma$ is the surface of the paraboloid $z = 1 - x^2 - y^2$, $z \ge 0$, compute $\int_\sigma\mathbf{n}\cdot\operatorname{curl}\mathbf{v}\,d\sigma$.
3. What is the value of the surface integral $\int_\sigma\mathbf{n}\cdot\operatorname{curl}\mathbf{v}\,d\sigma$ if $\mathbf{v} = \mathbf{i}y^2 + \mathbf{j}xy + \mathbf{k}xz$ and $\sigma$ is the hemisphere $x^2+y^2+z^2 = 1$, $z \ge 0$? Evaluate this integral directly and by Stokes's theorem.
4. Compute $\oint_C\mathbf{v}\cdot d\mathbf{r}$ if $\mathbf{v} = \mathbf{i}(x^2+y^2) + \mathbf{j}(x^2+z^2) + \mathbf{k}y$ and $C$ is the circle $x^2+y^2 = 4$ in the plane $z = 0$.
5. Prove that the area $A$ of the plane region bounded by a simple closed curve $C$ in the $xy$ plane is given by
$$A = \frac12\oint_C(x\,dy - y\,dx)$$
when $C$ is described in the positive direction. Hint: Use Green's theorem (9-6).
6. Verify Stokes's theorem if $\mathbf{v} = \mathbf{i}y^2 + \mathbf{j}xy - \mathbf{k}xz$ and $\sigma$ is the hemisphere $z = \sqrt{a^2 - x^2 - y^2}$.
ILLUSTRATIONS AND APPLICATIONS
12. Solenoidal and Irrotational Fields. Let a continuously differentiable
vector function v(P) be specified in a region R. If curl v = 0 at every
point of R, we say that v(P) is an irrotational vector field. If v(P) is such
that div v = 0, the field is said to be solenoidal. The importance of
solenoidal and irrotational vectors in applications derives from the fact
that every continuously differentiable vector function v(P) defined in a
regular simply connected region R can be expressed as the sum of two
vector functions, one of which is solenoidal and the other irrotational.
We do not prove this fact here because it depends on demonstrating the
existence of solutions of certain partial differential equations , 1 and it would
carry us too far in the study of potential theory. Accordingly, we limit
our discussion to proofs of two basic theorems concerned with solenoidal
and irrotational vector fields.
1 See Prob. 6. A discussion of the system of equations in question is contained in
M. Mason and W. Weaver, “The Electromagnetic Field/’ pp. 352-365, University of
Chicago Press, Chicago, 1932.
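Theorem I. A necessary and sufficient condition that a continuously differentiable vector $\mathbf{v}(P)$ be irrotational in a simply connected regular region $R$ is that $\mathbf{v} = \nabla u$, where $u$ is a single-valued scalar function with continuous second derivatives.
We suppose, first, that $\mathbf{v} = \nabla u$; then

$$\operatorname{curl}\mathbf{v} = \operatorname{curl}\nabla u = \begin{vmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} & \dfrac{\partial u}{\partial z}\end{vmatrix} = 0,$$

as follows at once on expanding the determinant and noting the equality of the mixed partial derivatives of $u(x,y,z)$.
Conversely, if we suppose that $\operatorname{curl}\mathbf{v} = 0$ in $R$, then it follows from the concluding paragraph of Sec. 11 that $du = \mathbf{v}\cdot d\mathbf{r}$ and hence $\mathbf{v} = \nabla u$.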
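Theorem II. The continuously differentiable vector function $\mathbf{v}(P)$ is solenoidal in a region bounded by a regular surface if, and only if, it is equal to the curl of some vector w with continuous second derivatives.
Let us suppose, first, that $\mathbf{v} = \operatorname{curl}\mathbf{w}$. Then

$$\operatorname{div}\mathbf{v} = \operatorname{div}\operatorname{curl}\mathbf{w} \equiv 0,$$

as follows from a simple calculation¹ making use of formulas (7-5) and (10-9). Conversely, if $\operatorname{div}\mathbf{v} = 0$, we show that a vector w can be constructed such that $\mathbf{v} = \operatorname{curl}\mathbf{w}$. It suffices to show that the system of equations $\operatorname{curl}\mathbf{w} = \mathbf{v}$, or

$$\frac{\partial w_z}{\partial y} - \frac{\partial w_y}{\partial z} = v_x,\qquad \frac{\partial w_x}{\partial z} - \frac{\partial w_z}{\partial x} = v_y,\qquad \frac{\partial w_y}{\partial x} - \frac{\partial w_x}{\partial y} = v_z, \tag{12-1}$$

has a solution for $w_x$, $w_y$, $w_z$ whenever

$$\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z} = 0. \tag{12-2}$$

1 See Prob. 1b, Sec. 10.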
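We show how to construct one such solution in rectangular domains. If we take $w_x = 0$, then the second and third of Eqs. (12-1) require that

$$\frac{\partial w_z}{\partial x} = -v_y(x,y,z),\qquad \frac{\partial w_y}{\partial x} = v_z(x,y,z). \tag{12-3}$$

On integrating (12-3) with respect to $x$ and treating $y$ and $z$ as constants, we get

$$w_z = -\int_{x_0}^{x}v_y(x,y,z)\,dx + \phi(y,z),\qquad w_y = \int_{x_0}^{x}v_z(x,y,z)\,dx + \psi(y,z), \tag{12-4}$$

where $\phi$ and $\psi$ are arbitrary differentiable functions of $y$ and $z$. If we insert these solutions in the first of Eqs. (12-1), we get

$$-\int_{x_0}^{x}\left(\frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}\right)dx + \frac{\partial\phi}{\partial y} - \frac{\partial\psi}{\partial z} = v_x. \tag{12-5}$$

But from (12-2)

$$\frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z} = -\frac{\partial v_x}{\partial x},$$

so that (12-5) yields

$$v_x = \int_{x_0}^{x}\frac{\partial v_x}{\partial x}\,dx + \frac{\partial\phi}{\partial y} - \frac{\partial\psi}{\partial z} = v_x(x,y,z) - v_x(x_0,y,z) + \frac{\partial\phi}{\partial y} - \frac{\partial\psi}{\partial z}. \tag{12-6}$$

This equation can be satisfied by taking $\psi = 0$ and

$$\phi(y,z) = \int_{y_0}^{y}v_x(x_0,y,z)\,dy.$$

Thus, one solution of the system (12-1) and (12-2) is

$$w_x = 0,\qquad w_y = \int_{x_0}^{x}v_z(x,y,z)\,dx,\qquad w_z = -\int_{x_0}^{x}v_y(x,y,z)\,dx + \int_{y_0}^{y}v_x(x_0,y,z)\,dy. \tag{12-7}$$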
The proof clearly indicates that w is not unique. Indeed, if we take w with components given by (12-7) and add to it $\nabla u$, where $u$ is an arbitrary scalar function with continuous second derivatives, then

$$\operatorname{curl}(\mathbf{w} + \nabla u) = \operatorname{curl}\mathbf{w}$$

inasmuch as $\operatorname{curl}\nabla u \equiv 0$.¹
1 Conversely, if $\operatorname{curl}\mathbf{w}_1 = \mathbf{v}$, then $\operatorname{curl}(\mathbf{w}_1 - \mathbf{w}) = 0$ and $\mathbf{w}_1 - \mathbf{w} = \nabla u$ by Theorem I. Thus every solution $\mathbf{w}_1$ is representable in the form $\mathbf{w} + \nabla u$, where w is the particular solution found in the text.
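The construction of w in the proof of Theorem II can be carried out numerically for a sample field. The sketch below is illustrative only: the solenoidal field, the base point $(x_0, y_0)$, and all function names are assumptions; the integrals in (12-7) are approximated by Simpson's rule and the curl by central differences.

```python
def simpson(f, a, b, n=64):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def v(x, y, z):
    """Sample solenoidal field (an assumption): div v = y - y + 0 = 0."""
    return (x * y, -0.5 * y * y + z, x)

X0, Y0 = 1.0, 0.0  # arbitrary lower limits of integration

def w(x, y, z):
    """Vector potential built from the three integrals of the construction."""
    w_y = simpson(lambda s: v(s, y, z)[2], X0, x)
    w_z = (-simpson(lambda s: v(s, y, z)[1], X0, x)
           + simpson(lambda t: v(X0, t, z)[0], Y0, y))
    return (0.0, w_y, w_z)

def curl(f, p, h=1e-4):
    """Central-difference curl of the vector function f at the point p."""
    def d(comp, axis):
        lo, hi = list(p), list(p)
        lo[axis] -= h
        hi[axis] += h
        return (f(*hi)[comp] - f(*lo)[comp]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

p = (0.7, -0.4, 1.3)
curl_w = curl(w, p)   # should reproduce v at p
v_at_p = v(*p)
```

Changing `X0` or `Y0` changes w but not its curl, in line with the remark that w is determined only up to the gradient of a scalar.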
We remark in conclusion that whenever the divergence and curl of a
vector function v are specified in the interior of a regular simply connected
region and the normal component of v is known over the surface bounding
the region, then there is just one vector function v satisfying these condi-
tions. This uniqueness theorem is important in many applications. The
reader may prove it by following suggestions given in Prob. 7 below.
PROBLEMS
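1. Show that $\mathbf{v} = \mathbf{i}\,2xyz + \mathbf{j}\,x^2z + \mathbf{k}\,x^2y$ is irrotational, and find $u(x,y,z)$ such that $\mathbf{v} = \nabla u$.
2. Show that $\mathbf{v} = \mathbf{i}(z-y) + \mathbf{j}(x-z) + \mathbf{k}(y-x)$ is solenoidal, and find $\mathbf{w}(x,y,z)$ such that $\mathbf{v} = \operatorname{curl}\mathbf{w}$.
3. Is $\mathbf{v} = \mathbf{i}(y^2 + 2xz^2 - 1) + \mathbf{j}\,2xy + \mathbf{k}\,2x^2z$ irrotational? If so, find $u$ such that $\mathbf{v} = \nabla u$.
4. Is $\mathbf{v} = \mathbf{i}(x^3z - 2xyz) + \mathbf{j}(xy - 3x^2yz) + \mathbf{k}(yz^2 - xz)$ solenoidal? If so, find a w such that $\mathbf{v} = \operatorname{curl}\mathbf{w}$.
5. Prove that $\mathbf{v} = r^n\mathbf{r}$, where $\mathbf{r} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$, is irrotational. Is it solenoidal?
6. Let $\mathbf{w} = \mathbf{u} + \mathbf{v}$, where u is irrotational and v solenoidal in a given suitably restricted region $R$. Then there exists a vector q such that $\mathbf{v} = \operatorname{curl}\mathbf{q}$ and a scalar $\phi$ such that $\mathbf{u} = \nabla\phi$. Show that $\phi$ and q satisfy the following partial differential equations:
$$\nabla^2\phi = \operatorname{div}\mathbf{w},\qquad \nabla\operatorname{div}\mathbf{q} - \nabla^2\mathbf{q} = \operatorname{curl}\mathbf{w}.$$
7. If v is a continuously differentiable vector function defined in a regular simply connected region $R$ bounded by the surface $\sigma$ and if
$$\operatorname{curl}\mathbf{v} = \mathbf{f}(x,y,z),\qquad \operatorname{div}\mathbf{v} = g(x,y,z)$$
in $R$ and $\mathbf{v}\cdot\mathbf{n} = h(x,y,z)$ on $\sigma$, show that v is uniquely determined in $R$ by these conditions.
Outline of the Solution. Assume that there are two such vectors, $\mathbf{v} = \mathbf{v}_1$ and $\mathbf{v} = \mathbf{v}_2$. With $\mathbf{w} = \mathbf{v}_1 - \mathbf{v}_2$, show that there is a $u$ such that $\mathbf{w} = \nabla u$, and deduce $\nabla^2 u = 0$. By applying the divergence theorem to the vector $u\nabla u$, show that $\int_\tau(\nabla u)\cdot(\nabla u)\,d\tau = 0$. Since $(\nabla u)\cdot(\nabla u) \ge 0$, this integral can vanish only if $\nabla u \equiv 0$.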
13. 'Gradient, Divergence, and Curl in Orthogonal Curvilinear Coordi-
nates. In this section we record the expressions for the gradient, diver-
gence, curl, and Laplacian in orthogonal curvilinear coordinates. These
can be obtained from the definitions (7-3), (8-7), and (10-2) in a manner
so similar to that used to obtain formulas valid in cartesian coordinates
that we dispense with the details of calculations.
As in Sec. 1, we suppose that a transformation
Vi =* y%(x i,*a,sa), i = 1, 2, 3,
wherein the variables yi are cartesian, defines a curvilinear coordinate
system x. We suppose that the coordinates x, are orthogonal so that the
quadratic differential form (2-13) has the structure
(ds) 2 * gn{dxi) 2 + Qnidx*) 2 4-
406 VECTOR FIELD THEORY [CHAP. 6
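We denote the unit base vectors along the coordinate lines by $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$
and represent a vector $\mathbf{v}(P)$ in the form
$$\mathbf{v} = \mathbf{e}_1v_1 + \mathbf{e}_2v_2 + \mathbf{e}_3v_3. \qquad \text{(13-1)}$$
The volume element $d\tau$ formed by the coordinate surfaces $x_i = \text{const}$
and $x_i + dx_i = \text{const}$ (Fig. 26) has the shape of a rectangular
parallelepiped with edges$^1$ $ds_i = \sqrt{g_{ii}}\,dx_i$.
Hence the areas $d\sigma_{ij}$ of its faces are
$$d\sigma_{12} = \sqrt{g_{11}g_{22}}\,dx_1\,dx_2, \qquad d\sigma_{13} = \sqrt{g_{11}g_{33}}\,dx_1\,dx_3, \qquad d\sigma_{23} = \sqrt{g_{22}g_{33}}\,dx_2\,dx_3, \qquad \text{(13-2)}$$
and its volume $d\tau$ is
$$d\tau = \sqrt{g_{11}g_{22}g_{33}}\,dx_1\,dx_2\,dx_3. \qquad \text{(13-3)}$$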
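To compute div $\mathbf{v}$ we calculate the flux $\int \mathbf{v}\cdot\mathbf{n}\,d\sigma$ over the surface of the
volume element $d\tau$ and divide it by its volume (13-3). A calculation like
that performed in Sec. 7 yields the result
$$\operatorname{div}\mathbf{v} = \frac{1}{h_1h_2h_3}\left[\frac{\partial(v_1h_2h_3)}{\partial x_1} + \frac{\partial(v_2h_1h_3)}{\partial x_2} + \frac{\partial(v_3h_1h_2)}{\partial x_3}\right], \qquad \text{(13-4)}$$
where $h_i = \sqrt{g_{ii}}$.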
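A similar but slightly longer computation also yields the formula
$$\operatorname{curl}\mathbf{v} = \mathbf{e}_1\frac{1}{h_2h_3}\left[\frac{\partial(h_3v_3)}{\partial x_2} - \frac{\partial(h_2v_2)}{\partial x_3}\right] + \mathbf{e}_2\frac{1}{h_1h_3}\left[\frac{\partial(h_1v_1)}{\partial x_3} - \frac{\partial(h_3v_3)}{\partial x_1}\right] + \mathbf{e}_3\frac{1}{h_1h_2}\left[\frac{\partial(h_2v_2)}{\partial x_1} - \frac{\partial(h_1v_1)}{\partial x_2}\right], \qquad \text{(13-5)}$$
which can be written more compactly as
$$\operatorname{curl}\mathbf{v} = \frac{1}{h_1h_2h_3}\begin{vmatrix} h_1\mathbf{e}_1 & h_2\mathbf{e}_2 & h_3\mathbf{e}_3 \\ \dfrac{\partial}{\partial x_1} & \dfrac{\partial}{\partial x_2} & \dfrac{\partial}{\partial x_3} \\ h_1v_1 & h_2v_2 & h_3v_3 \end{vmatrix}. \qquad \text{(13-6)}$$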
1 See Sec. 2.
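Finally, the formula for the gradient of a scalar $u(x_1,x_2,x_3)$, as follows
from (8-7), is$^1$
$$\nabla u = \frac{\mathbf{e}_1}{h_1}\frac{\partial u}{\partial x_1} + \frac{\mathbf{e}_2}{h_2}\frac{\partial u}{\partial x_2} + \frac{\mathbf{e}_3}{h_3}\frac{\partial u}{\partial x_3}. \qquad \text{(13-7)}$$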
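Inasmuch as $\operatorname{div}\nabla u = \nabla^2 u$, it is easy to check that the substitution
of $\mathbf{v} = \nabla u$ in (13-4) yields
$$\nabla^2 u = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial x_1}\left(\frac{h_2h_3}{h_1}\frac{\partial u}{\partial x_1}\right) + \frac{\partial}{\partial x_2}\left(\frac{h_1h_3}{h_2}\frac{\partial u}{\partial x_2}\right) + \frac{\partial}{\partial x_3}\left(\frac{h_1h_2}{h_3}\frac{\partial u}{\partial x_3}\right)\right]. \qquad \text{(13-8)}$$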
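In cylindrical coordinates defined by the transformation
$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z,$$
the metric coefficients are$^2$
$$g_{11} = 1, \qquad g_{22} = r^2, \qquad g_{33} = 1,$$
so that $h_1 = 1$, $h_2 = r$, $h_3 = 1$.
Accordingly, formulas (13-4) and (13-8) yield
$$\operatorname{div}\mathbf{v} = \frac{1}{r}\frac{\partial(rv_r)}{\partial r} + \frac{1}{r}\frac{\partial v_\theta}{\partial\theta} + \frac{\partial v_z}{\partial z},$$
$$\nabla^2 u = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} + \frac{\partial^2 u}{\partial z^2},$$
where $\mathbf{v} = \mathbf{r}_1v_r + \boldsymbol{\theta}_1v_\theta + \mathbf{k}v_z$,
$\mathbf{r}_1$, $\boldsymbol{\theta}_1$, $\mathbf{k}$ being unit vectors in the direction of increasing $r$, $\theta$, and $z$ (Fig. 3).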
In spherical coordinates determined by
$$x = \rho\sin\theta\cos\phi, \qquad y = \rho\sin\theta\sin\phi, \qquad z = \rho\cos\theta,$$
1 Henceforth we shall use the symbol $\nabla u$ to mean grad $u$ in curvilinear coordinates as
well as in cartesian.
2 See Sec. 2.
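$h_1 = 1$, $h_2 = \rho$, $h_3 = \rho\sin\theta$, as follows from Prob. 2 in Sec. 2. On making
use of (13-4) and (13-8) we find that in spherical coordinates
$$\operatorname{div}\mathbf{v} = \frac{1}{\rho^2}\frac{\partial(\rho^2v_\rho)}{\partial\rho} + \frac{1}{\rho\sin\theta}\frac{\partial(\sin\theta\,v_\theta)}{\partial\theta} + \frac{1}{\rho\sin\theta}\frac{\partial v_\phi}{\partial\phi},$$
$$\nabla^2 u = \frac{1}{\rho^2}\frac{\partial}{\partial\rho}\left(\rho^2\frac{\partial u}{\partial\rho}\right) + \frac{1}{\rho^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial u}{\partial\theta}\right) + \frac{1}{\rho^2\sin^2\theta}\frac{\partial^2 u}{\partial\phi^2},$$
where
$$\mathbf{v} = \boldsymbol{\rho}_1v_\rho + \boldsymbol{\theta}_1v_\theta + \boldsymbol{\phi}_1v_\phi$$
and $\boldsymbol{\rho}_1$, $\boldsymbol{\theta}_1$, $\boldsymbol{\phi}_1$ are the unit vectors in the direction of increasing coordinate
lines shown in Fig. 4.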
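As a numerical sanity check on the Laplacian formula (13-8), the following Python sketch evaluates it by central differences for given scale factors $h_1$, $h_2$, $h_3$. The function names (`laplacian_orthogonal`, `spherical_h`, `cylindrical_h`), the step size `d`, and the test fields are illustrative assumptions, not part of the text.

```python
import math

def laplacian_orthogonal(u, h, x, d=1e-3):
    """Approximate formula (13-8) at x = (x1, x2, x3) by central differences.

    u(x1, x2, x3) is the scalar field; h(x1, x2, x3) returns (h1, h2, h3).
    """
    def weighted_derivative(i, pt):
        # (h1*h2*h3 / h_i**2) * du/dx_i, which equals (h_j*h_k / h_i) * du/dx_i
        hs = h(*pt)
        plus, minus = list(pt), list(pt)
        plus[i] += d
        minus[i] -= d
        du = (u(*plus) - u(*minus)) / (2 * d)
        return hs[0] * hs[1] * hs[2] / hs[i] ** 2 * du

    total = 0.0
    for i in range(3):
        plus, minus = list(x), list(x)
        plus[i] += d
        minus[i] -= d
        total += (weighted_derivative(i, plus) - weighted_derivative(i, minus)) / (2 * d)
    h1, h2, h3 = h(*x)
    return total / (h1 * h2 * h3)

def spherical_h(rho, theta, phi):
    # scale factors h1 = 1, h2 = rho, h3 = rho*sin(theta)
    return (1.0, rho, rho * math.sin(theta))

def cylindrical_h(r, theta, z):
    # scale factors h1 = 1, h2 = r, h3 = 1
    return (1.0, r, 1.0)

# u = rho**2 is x**2 + y**2 + z**2, whose cartesian Laplacian is 6
value = laplacian_orthogonal(lambda rho, th, ph: rho ** 2, spherical_h, (1.5, 0.8, 0.3))
print(value)
```

The same routine applied to $u = \rho\cos\theta$ (that is, $u = z$) or to $u = r^2\cos 2\theta$ (that is, $u = x^2 - y^2$) in cylindrical coordinates should return values close to zero, since both functions are harmonic.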
PROBLEMS
1. Write out the expressions for $\nabla u$ in spherical and cylindrical coordinates.
2. What is the form of $\nabla^2$ in parabolic coordinates for which
$$(ds)^2 = (u^2 + v^2)[(du)^2 + (dv)^2] + u^2v^2(d\phi)^2?$$
3. The force $\mathbf{F}$ per unit charge due to a dipole of constant strength $p$ is
$$\mathbf{F} = \mathbf{r}_1(2p\cos\theta/r^3) + \boldsymbol{\theta}_1(p\sin\theta/r^3),$$
where $r$, $\theta$ are polar coordinates. Compute div $\mathbf{F}$ and curl $\mathbf{F}$.
14. Conservative Force Fields. In the concluding sections of this chap-
ter we illustrate the use of vector analysis in the treatment of several
problems drawn from mechanics, hydrodynamics, and the theory of heat
flow in solids.
When a particle of matter is displaced along a path C in a given field
of force F, the work W expended in moving it is determined by the integral
$$W = \int_C \mathbf{F}\cdot d\mathbf{r}. \qquad \text{(14-1)}$$
The integral (14-1), in general, will have different values for different
paths joining the same two points in the force field. If (14-1) is independent
of the path, the field F is said to be conservative.
We show next that the force field determined by Newton’s inverse-square
law of attraction is conservative. 1 According to Newton’s law a particle
of mass m located at a point P is acted on by a force F whose magnitude
is proportional to m and inversely proportional to the square of the distance
$r$ from $P$ to the center of attraction $O$. Thus,
$$\mathbf{F} = -\frac{km}{r^2}\,\mathbf{r}_1, \qquad \text{(14-2)}$$
1 A similar discussion applies to electrostatic force fields determined by Coulomb’s
law, since the mathematical structures of Newton’s and Coulomb’s laws are identical.
where $\mathbf{r}_1$ is the unit vector directed from $O$ to $P$. The positive constant
$k$ is determined experimentally; it clearly depends on the choice of units
of measure of $\mathbf{F}$. Physically the law (14-2) represents the force of attraction
of the mass $m$ at $P$ by a unit mass located at $O$.
If we rewrite (14-2) in the form
$$\mathbf{F} = -\frac{km}{r^3}\,\mathbf{r}, \qquad \text{(14-3)}$$
where $\mathbf{r} = r\mathbf{r}_1$, and insert it in the work integral (14-1), we get for the
work done in displacing the particle from $P_1$ to $P_2$ along the path $C$,
$$W = -\int_C \frac{km}{r^3}\,\mathbf{r}\cdot d\mathbf{r}. \qquad \text{(14-4)}$$
But $\mathbf{r}\cdot d\mathbf{r} = \tfrac{1}{2}\,d(\mathbf{r}\cdot\mathbf{r}) = r\,dr$, so that we can write (14-4) as
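$$W = -km\int_{P_1}^{P_2}\frac{dr}{r^2}. \qquad \text{(14-5)}$$
The integral (14-5) is clearly independent of the path joining $P_1$ and $P_2$,
and if we denote $r(P_2)$ by $r_2$ and $r(P_1)$ by $r_1$ (Fig. 27), we can write
$$W = km\left(\frac{1}{r_2} - \frac{1}{r_1}\right) = u(P_2) - u(P_1), \qquad u(P) = \frac{km}{r}. \qquad \text{(14-6)}$$
The function $u(P)$ in (14-6) is continuous at all points except when $r = 0$,
and since $\operatorname{div}\nabla u = \nabla^2 u$, we readily
find that the gravitational potential (14-6) satisfies Laplace's equation
$$\nabla^2 u = 0$$
except when $r = 0$.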
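The path independence of the work integral (14-1) for the inverse-square field can be checked numerically. The sketch below is only an illustration under stated assumptions: the constants `K` and `M`, the endpoints `P1` and `P2`, and the two paths `straight` and `detour` are arbitrary choices made here, and the integral is approximated by a midpoint rule. For $\mathbf{F} = -(km/r^3)\,\mathbf{r}$ both paths should give a value close to $km(1/r_2 - 1/r_1)$.

```python
import math

K, M = 1.0, 1.0  # illustrative values of the constants k and m

def force(x, y, z):
    # inverse-square attraction toward the origin: F = -(k m / r**3) * r
    r = math.sqrt(x * x + y * y + z * z)
    c = -K * M / r ** 3
    return (c * x, c * y, c * z)

def work(path, n=20000):
    """Midpoint-rule approximation of W = integral of F . dr along path(t), 0 <= t <= 1."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        mid = tuple(0.5 * (a + b) for a, b in zip(prev, cur))
        f = force(*mid)
        total += sum(fc * (b - a) for fc, a, b in zip(f, prev, cur))
        prev = cur
    return total

P1, P2 = (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)

def straight(t):
    # straight segment from P1 to P2
    return tuple(a + t * (b - a) for a, b in zip(P1, P2))

def detour(t):
    # same endpoints; the bump vanishes at t = 0 and t = 1
    x, y, z = straight(t)
    bump = math.sin(math.pi * t)
    return (x + bump, y, z + 2.0 * bump)

r1 = math.sqrt(sum(c * c for c in P1))
r2 = math.sqrt(sum(c * c for c in P2))
expected = K * M * (1.0 / r2 - 1.0 / r1)
print(work(straight), work(detour), expected)
```

Neither path passes through the origin, where the field is singular; a path through $r = 0$ would not be admissible in this check.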
The gravitational potential at a point $P$ due to a continuous distribution
of mass of density $\rho$ is defined by the integral
$$u(P) = \int_\tau \frac{k\rho\,d\tau}{r}, \qquad \text{(14-8)}$$
where $r$ is the distance from the element of mass $dm = \rho\,d\tau$ to the point $P$.
The force of attraction of the unit mass located at P by the body is
determined by the formula F = Vu.
The study of the properties of the scalar function u(P) defined by
(14-8) is in the province of potential theory, and we shall encounter it once
more in Chap. 6.
Example: Let us compute the gravitational potential $u(P)$ of a thin homogeneous
spherical shell of radius $a$ at a point $P$ whose distance from the center of the shell is $R$
(Fig. 28).
The potential at $P$ can be computed by summing potentials of the ring-shaped elements
of matter bounded by the cones with the semivertical angles $\theta$ and $\theta + d\theta$. The
area of the zone intercepted by these cones is $2\pi a\sin\theta\cdot a\,d\theta$, so that
$$u(P) = \int_0^\pi \frac{k\rho\,2\pi a^2\sin\theta\,d\theta}{r}, \qquad \text{(14-9)}$$
where $\rho$ is the mass per unit area of the shell.
From the cosine law of trigonometry
$$r = \sqrt{a^2 + R^2 - 2aR\cos\theta},$$
and we can write (14-9) as
$$u(P) = 2\pi k\rho a^2\int_0^\pi \frac{\sin\theta\,d\theta}{\sqrt{a^2 + R^2 - 2aR\cos\theta}} = \begin{cases} \dfrac{2\pi k\rho a}{R}\left[\sqrt{(R+a)^2} - \sqrt{(R-a)^2}\right], & \text{if } R > a, \\[2ex] \dfrac{2\pi k\rho a}{R}\left[\sqrt{(R+a)^2} - \sqrt{(a-R)^2}\right], & \text{if } R < a. \end{cases}$$
If $P$ is outside the shell, $R > a$, and we have the result
$$u(P) = \frac{4\pi k\rho a^2}{R} = \frac{kM}{R}, \qquad \text{(14-10)}$$
where $M = 4\pi a^2\rho$ is the mass of the shell.
When $P$ is inside the shell, $R < a$, and we get
$$u(P) = 4\pi k\rho a, \qquad \text{(14-11)}$$
a constant.
The result (14-10) can be stated as a theorem.
Theorem. The potential (and hence the force of attraction $\mathbf{F} = \nabla u$) produced by a thin
spherical shell at a point exterior to the shell is the same as if the mass of the shell were
concentrated at its center.
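The closed forms (14-10) and (14-11) can be compared with a direct numerical evaluation of the integral (14-9). The following Python sketch uses a midpoint rule; the function name `shell_potential` and the default values $a = 1$, $k = 1$, $\rho = 1$ are assumptions made for illustration only.

```python
import math

def shell_potential(R, a=1.0, k=1.0, rho=1.0, n=20000):
    """Midpoint-rule value of (14-9): the sum of k*rho*2*pi*a^2*sin(theta)*dtheta / r."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        # cosine law: distance from P to the ring at polar angle theta
        r = math.sqrt(a * a + R * R - 2.0 * a * R * math.cos(theta))
        total += k * rho * 2.0 * math.pi * a * a * math.sin(theta) / r * dtheta
    return total

outside = shell_potential(2.5)   # compare with 4*pi*k*rho*a**2 / R
inside_1 = shell_potential(0.1)  # compare with 4*pi*k*rho*a
inside_2 = shell_potential(0.7)  # same constant, independent of R
print(outside, inside_1, inside_2)
```

The two interior values agreeing with each other illustrates why the force of attraction vanishes inside the shell: the potential there is constant. The sketch excludes $R = a$, where the integrand is singular at $\theta = 0$.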
The potential due to a solid sphere of constant density $\rho$ at a point outside the sphere
can be deduced at once from (14-10) by supposing the sphere to consist of thin concentric
shells. We conclude that this potential has the same form as (14-10) with $M$ replaced
by the mass of the sphere. Accordingly, the force of attraction produced by a
solid homogeneous sphere on a unit mass at a point $P$ outside the sphere has the
magnitude $kM/R^2$. This force is directed toward the center of the sphere.
From (14-11) we see that the force of attraction at a point inside the shell is zero.
The integral (14-8) becomes improper if $P$ is within the solid, for in that case
$r = \sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}$ becomes zero when the integration variables $(\xi,\eta,\zeta)$
coincide with the coordinates $(x,y,z)$ of $P$. However, the concepts of potential and
gravitational attraction can be shown to have a meaning even when $P$ is a point in the
interior of a homogeneous solid.$^1$
15. Steady Flow of Fluids. Let $C$ be a curve in the $xy$ plane over which
a sheet of homogeneous fluid of depth 1 is flowing. The lines of flow
of the fluid particles are indicated in Fig. 29 by curved arrows, and we
suppose that the flow pattern is identical in all planes parallel to
the $xy$ plane. A flow of this sort is called two-dimensional.
The problem is to determine the amount of fluid that crosses $C$ per
unit time. We denote by $\mathbf{v}$ the velocity of the fluid particles at a point
$P$ on $C$ and compute the volume $dV$ of fluid crossing an element $d\mathbf{r}$ of $C$
per unit time. Since the depth of the fluid is 1, this volume is equal to the
volume of the parallelepiped
$$dV = \mathbf{k}\cdot\mathbf{v}\times d\mathbf{r},$$
where $\mathbf{k}$ is the unit vector perpendicular to the $xy$ plane. The volume $V$
crossing $C$ per unit time, therefore, is
$$V = \int_C \mathbf{k}\cdot\mathbf{v}\times d\mathbf{r}.$$
1 See in this connection Sec. 20, Chap. 1, and I. S. Sokolnikoff, “Tensor Analysis,”
sec. 89, John Wiley & Sons, Inc., New York, 1951, where it is shown that the potential
$u$ satisfies Poisson's equation $\nabla^2 u = -4\pi\rho$.
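But by Chap. 4, Eq. (6-2),
$$\mathbf{k}\cdot\mathbf{v}\times d\mathbf{r} = \begin{vmatrix} 0 & 0 & 1 \\ v_x & v_y & 0 \\ dx & dy & 0 \end{vmatrix} = v_x\,dy - v_y\,dx,$$
since $\mathbf{v} = \mathbf{i}v_x + \mathbf{j}v_y$ and $d\mathbf{r} = \mathbf{i}\,dx + \mathbf{j}\,dy$.
Accordingly,
$$V = \int_C (v_x\,dy - v_y\,dx). \qquad \text{(15-1)}$$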
If C is a closed curve and the fluid is incompressible, the net amount
of fluid crossing C is zero, because as much fluid enters the region bounded
by C as leaves it. Thus a steady flow of an incompressible fluid is char-
acterized by the equation
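$$\int_C (v_x\,dy - v_y\,dx) = 0, \qquad \text{(15-2)}$$
where the integral is evaluated over any closed curve $C$ not enclosing the
points at which the fluid is generated or absorbed. But Eq. (15-2) implies
that $-v_y\,dx + v_x\,dy$ is an exact differential $d\Psi(x,y)$ of the function
$$\Psi(x,y) = \int_{(x_0,y_0)}^{(x,y)} (-v_y\,dx + v_x\,dy). \qquad \text{(15-3)}$$
Moreover,$^1$
$$\frac{\partial\Psi}{\partial x} = -v_y, \qquad \frac{\partial\Psi}{\partial y} = v_x, \qquad \text{(15-4)}$$
and $v_x$ and $v_y$ satisfy the condition
$$\frac{\partial(-v_y)}{\partial y} = \frac{\partial v_x}{\partial x} \qquad \text{(15-5)}$$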
throughout the region R in which (15-2) holds. Equation (15-5) is a con-
sequence of (15-2); it states, in effect, that there is no fluid created or
destroyed in the region R. For this reason it is called the equation of
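continuity. Since
$$\operatorname{div}\mathbf{v} = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y},$$
we can write (15-5) in vector form as
$$\operatorname{div}\mathbf{v} = 0, \qquad \text{(15-6)}$$
which is consistent with the meaning attached to the symbol div $\mathbf{v}$ in
Sec. 7.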
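The flux integral (15-1) over a closed curve can be evaluated numerically to illustrate (15-2). The sketch below takes the field $\mathbf{v} = (\mathbf{i}x + \mathbf{j}y)/(x^2 + y^2)$, which appears in the problems of this section and has a source at the origin, and integrates $v_x\,dy - v_y\,dx$ counterclockwise around a circle. The function names, the circle parametrization, and the number of subdivisions are illustrative assumptions.

```python
import math

def velocity(x, y):
    # v = (i x + j y) / (x**2 + y**2); singular at the origin
    r2 = x * x + y * y
    return (x / r2, y / r2)

def flux_through_circle(cx, cy, radius, n=20000):
    """Approximate the integral of (v_x dy - v_y dx) counterclockwise around a circle."""
    total = 0.0
    for i in range(n):
        t0 = 2.0 * math.pi * i / n
        t1 = 2.0 * math.pi * (i + 1) / n
        tm = 0.5 * (t0 + t1)
        # velocity at the midpoint of the arc, increments along the chord
        vx, vy = velocity(cx + radius * math.cos(tm), cy + radius * math.sin(tm))
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        total += vx * dy - vy * dx
    return total

print(flux_through_circle(3.0, 0.0, 1.0))  # circle not enclosing the source
print(flux_through_circle(0.0, 0.0, 1.0))  # circle enclosing the source
```

For the circle that does not enclose the origin the result should be close to zero, in agreement with (15-2); for the circle around the origin it should be close to $2\pi$, which measures the fluid generated at the source per unit time.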
1 See Sec. 5.
The function $\Psi(x,y)$ defined by (15-3) is the stream function, and the
tracks of the particles of fluid, or streamlines, are determined by the
equation $\Psi(x,y) = \text{const}$. The velocity field satisfying (15-6), we recall, is
said to be solenoidal. If the flow $\mathbf{v}$ is irrotational, then $\operatorname{curl}\mathbf{v} = 0$ and
there exists a scalar function $\Phi(x,y)$ such that$^1$
$$\mathbf{v} = \nabla\Phi, \qquad \text{(15-7)}$$
$$v_x = \frac{\partial\Phi}{\partial x}, \qquad v_y = \frac{\partial\Phi}{\partial y}. \qquad \text{(15-8)}$$
The function $\Phi(x,y)$ determined by the integral
$$\Phi(x,y) = \int_{(x_0,y_0)}^{(x,y)} (v_x\,dx + v_y\,dy) \equiv \int_{P_0}^{P} \mathbf{v}\cdot d\mathbf{r}$$
is called the velocity potential because of the relations (15-8). We emphasize
the fact that the condition for the existence of $\Phi(x,y)$ is
$$\operatorname{curl}\mathbf{v} = 0$$
or in scalar form
$$\frac{\partial v_x}{\partial y} = \frac{\partial v_y}{\partial x}. \qquad \text{(15-9)}$$
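If the flow is both irrotational and solenoidal, the relations (15-4) and
(15-8) hold and we conclude that
$$\frac{\partial\Phi}{\partial x} = \frac{\partial\Psi}{\partial y}, \qquad \frac{\partial\Phi}{\partial y} = -\frac{\partial\Psi}{\partial x} \qquad \text{in } R. \qquad \text{(15-10)}$$
These are the celebrated Cauchy-Riemann equations which we shall encounter
again in Chap. 7.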
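As a concrete instance of (15-10), the pair $\Phi = x^2 - y^2$, $\Psi = 2xy$ (both functions occur in the problems of this section) can be tested by finite differences. The helper names `partial` and `cauchy_riemann_residuals` and the step size are assumptions made for this sketch.

```python
def phi(x, y):
    # velocity potential
    return x * x - y * y

def psi(x, y):
    # stream function
    return 2.0 * x * y

def partial(f, x, y, var, d=1e-5):
    """Central-difference approximation of df/dx (var == 'x') or df/dy (otherwise)."""
    if var == "x":
        return (f(x + d, y) - f(x - d, y)) / (2.0 * d)
    return (f(x, y + d) - f(x, y - d)) / (2.0 * d)

def cauchy_riemann_residuals(x, y):
    # both residuals vanish when (15-10) holds
    r1 = partial(phi, x, y, "x") - partial(psi, x, y, "y")
    r2 = partial(phi, x, y, "y") + partial(psi, x, y, "x")
    return r1, r2

print(cauchy_riemann_residuals(1.3, -0.7))
```

Both residuals should be zero up to rounding error, and the velocity components follow from either function: $v_x = \partial\Phi/\partial x = \partial\Psi/\partial y = 2x$ and $v_y = \partial\Phi/\partial y = -\partial\Psi/\partial x = -2y$.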
Furthermore, if $\operatorname{div}\mathbf{v} = 0$ and $\mathbf{v}$ is given by (15-7), we see that
$$\operatorname{div}\nabla\Phi = \nabla^2\Phi = 0. \qquad \text{(15-11)}$$
Thus, the velocity potential $\Phi$ satisfies Laplace's equation throughout any
region containing no sources or sinks.
On differentiating the first of Eqs. (15-10) with respect to $y$ and the
second with respect to $x$ and on equating $\partial^2\Phi/\partial x\,\partial y$ to $\partial^2\Phi/\partial y\,\partial x$, we find
that the stream function $\Psi(x,y)$ also satisfies the equation
$$\nabla^2\Psi = 0.$$
The practical importance of these results is stressed in Chap. 7, Secs. 19
to 21.
The foregoing considerations can be extended to the three-dimensional
flows as indicated in Sec. 17.
1 See Sec. 12.
PROBLEMS
1. Show that the gravitational field determined by (14-2) is both solenoidal and
irrotational except at (0,0,0).
2. Show that the velocity field
$$\mathbf{v} = \mathbf{i}\frac{x}{x^2 + y^2} + \mathbf{j}\frac{y}{x^2 + y^2}$$
is solenoidal in any region which does not contain the origin (0,0). Is it irrotational?
Verify that the velocity potential $\Phi = \log r = \tfrac{1}{2}\log(x^2 + y^2)$ and the stream function
$\Psi = \tan^{-1}(y/x) = \theta$. Compute the circulation around a circular path enclosing the
origin, and thus obtain a physical interpretation of results in Probs. 5 and 6 of Sec. 5
and Prob. 3 of Sec. 9.
3. Discuss a two-dimensional flow for which the velocity potential $\Phi = cx$. What is
the stream function $\Psi$ for this flow? Plot the curves $\Phi = \text{const}$ and $\Psi = \text{const}$.
4. Discuss a two-dimensional flow for which the stream function is $\Psi = 2xy$. Find
the velocity potential $\Phi$, and sketch the curves $\Phi = \text{const}$ and $\Psi = \text{const}$.
5. If $\mathbf{v}$ and $\mathbf{w}$ are irrotational vector fields, show that $\mathbf{v}\times\mathbf{w}$ is solenoidal.
6. Show that the streamlines are orthogonal to the lines $\Phi = \text{const}$.
7. Show that when the three-dimensional flow $\mathbf{v}$ is irrotational, the streamlines satisfy
the equations
$$\frac{dx}{v_x} = \frac{dy}{v_y} = \frac{dz}{v_z}.$$
8. If the velocity potential of the two-dimensional flow is $\Phi = x^2 - y^2$, find $\mathbf{v}$ and
obtain the equations of the streamlines. Is this flow solenoidal? Is it irrotational?
9. Show with the aid of the Cauchy-Riemann equations that when the stream function
$\Psi(x,y)$ is given, the velocity potential is determined by
$$\Phi(x,y) = \int_{(x_0,y_0)}^{(x,y)} \left(\frac{\partial\Psi}{\partial y}\,dx - \frac{\partial\Psi}{\partial x}\,dy\right).$$
10. Use the result given in the preceding problem to calculate $\Phi(x,y)$ if (a) $\Psi = x^3 - 3xy^2$,
(b) $\Psi = -y/(x^2 + y^2)$, $x \ne 0$, $y \ne 0$.
16. Equation of Heat Flow. The following derivation of the Fourier
equation of heat flow illustrates admirably the use of the divergence
theorem in mathematical physics.
It is known from empirical results that heat will flow from points at
higher temperatures to those at lower temperatures. At any point the
rate of decrease of temperature varies with the direction, and it is generally
assumed that the amount of heat $\Delta H$ crossing an element of surface $\Delta\sigma$
in $\Delta t$ sec is proportional to the greatest rate of decrease of the temperature
$u$; that is,
$$\Delta H \doteq k\,\Delta\sigma\,\Delta t\,\frac{du}{dn}.$$
Define the vector $\mathbf{q}$, representing the flow of heat, by the formula
$$\mathbf{q} = -k\nabla u, \qquad \text{(16-1)}$$
where $k$ is a constant of proportionality known as the thermal conductivity
of a substance. [The units of $k$ are cal/(cm-sec °C).] The negative sign
is chosen in the definition because heat flows from points of higher temperature
to those of lower, and the vector $\nabla u$ is directed normally to the
level surface $u = \text{const}$ in the direction of increasing $u$.
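Then the total amount of heat $H$ flowing out in $\Delta t$ sec from an arbitrary
volume $\tau$ bounded by a closed surface $\sigma$ is
$$H = -\Delta t\int_\sigma k\,\frac{du}{dn}\,d\sigma = \Delta t\int_\sigma \mathbf{q}\cdot\mathbf{n}\,d\sigma, \qquad \text{(16-2)}$$
since $\mathbf{q}\cdot\mathbf{n} = -k\,du/dn$ by (16-1).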
On the other hand, the amount of heat lost by the body $\tau$ can be calculated
as follows: In order to increase the temperature of a volume element
by $\Delta u$°, one must supply an amount of heat that is proportional to
the increase in temperature and to the mass of the volume element. Hence
$$\Delta H \doteq c\,\Delta u\,\rho\,\Delta\tau \doteq c\,\frac{\partial u}{\partial t}\,\Delta t\,\rho\,\Delta\tau,$$
where $c$ is the specific heat of the substance [cal/(g °C)] and $\rho$ is its density.
Therefore, the total loss of heat from the volume $\tau$ in $\Delta t$ sec is
$$H = -\Delta t\int_\tau \frac{\partial u}{\partial t}\,c\rho\,d\tau. \qquad \text{(16-3)}$$
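Equating (16-2) and (16-3) gives
$$\int_\sigma \mathbf{q}\cdot\mathbf{n}\,d\sigma = -\int_\tau \frac{\partial u}{\partial t}\,c\rho\,d\tau. \qquad \text{(16-4)}$$
Applying the divergence theorem to the left-hand member of (16-4) yields
$$\int_\tau \operatorname{div}\mathbf{q}\,d\tau = -\int_\tau \frac{\partial u}{\partial t}\,c\rho\,d\tau,$$
and since $\mathbf{q} = -k\nabla u$, the foregoing equation assumes the form
$$\int_\tau \left[\operatorname{div}(-k\nabla u) + c\rho\,\frac{\partial u}{\partial t}\right] d\tau = 0. \qquad \text{(16-5)}$$
Now, if $k$ is a constant,
$$\operatorname{div}(k\nabla u) = k\nabla^2 u$$
and (16-5) becomes
$$\int_\tau \left(-k\nabla^2 u + c\rho\,\frac{\partial u}{\partial t}\right) d\tau = 0. \qquad \text{(16-6)}$$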
Since this integral must vanish for an arbitrary volume r and the integrand
is a continuous function, it follows that the integrand must be equal to
zero, for if such were not the case, r could be so chosen as to be a region
throughout which the integrand has constant sign. But if the integrand
had one sign throughout this region, then the integral would have the same
sign and would not vanish as required by (16-6).
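Therefore,
$$-k\nabla^2 u + c\rho\,\frac{\partial u}{\partial t} = 0$$
or
$$\frac{\partial u}{\partial t} = h^2\nabla^2 u, \qquad \text{(16-7)}$$
where
$$h^2 = \frac{k}{c\rho}.$$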
Equation (16-7) was developed by Fourier in 1822 and is of basic impor-
tance in the study of heat conduction in solids. A similar equation occurs
in the study of current flow in conductors and in problems dealing with
diffusion in liquids and gases.
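To see Eq. (16-7) at work, the sketch below integrates its one-dimensional form $u_t = h^2u_{xx}$ on $0 \le x \le 1$ with an explicit forward-time, centered-space difference scheme and compares the result with the exact solution $u = e^{-\pi^2h^2t}\sin\pi x$. Everything here is an assumption made for illustration: the scheme itself (which is not the method of the text), the rod of unit length held at zero temperature at both ends, the initial profile, and the parameter names `h2`, `n`, `r`, and `steps`.

```python
import math

def ftcs_heat(h2=1.0, n=20, r=0.4, steps=100):
    """Explicit finite-difference solution of u_t = h2 * u_xx on [0, 1].

    r = h2 * dt / dx**2 must not exceed 1/2 for the scheme to be stable.
    Returns the grid values and the elapsed time.
    """
    dx = 1.0 / n
    dt = r * dx * dx / h2
    u = [math.sin(math.pi * i * dx) for i in range(n + 1)]
    u[0] = u[n] = 0.0  # ends held at zero temperature
    for _ in range(steps):
        new = u[:]
        for i in range(1, n):
            new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        u = new
    return u, steps * dt

u, t = ftcs_heat()
exact_mid = math.exp(-math.pi ** 2 * t) * math.sin(math.pi * 0.5)
print(u[10], exact_mid)
```

With these settings the computed midpoint temperature should agree with the exact value to about three decimal places; refining the grid (larger `n` with `r` fixed and `steps` scaled accordingly) should reduce the discrepancy.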
It follows from (16-7) that a steady distribution of temperatures is
characterized by the solution of Laplace's equation
$$\nabla^2 u = 0.$$
It was assumed in this derivation that the body is free from sources and
sinks. If there are sources of heat continuously distributed within $\tau$,
then it is necessary to add to the right-hand member of (16-3) the integral
$$\int_\tau f(x,y,z,t)\,d\tau,$$
where $f(x,y,z,t)$ is a function representing the strengths of the sources.
The reader will show that in this case one is led to the equation
$$\frac{\partial u}{\partial t} = h^2\nabla^2 u + \frac{f}{c\rho},$$
provided that the thermal conductivity of the substance is constant.
Thus the presence of sources leads to a nonhomogeneous partial differential
equation.
17. Equations of Hydrodynamics. Consider a region of space containing
a fluid, and let v denote the velocity of a typical particle of the fluid.
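The amount $Q$ of fluid crossing an arbitrary closed surface $\sigma$ drawn in
the region can be calculated by determining the flow across a typical
element $\Delta\sigma$ of the surface $\sigma$. A particle of fluid is displaced in $\Delta t$ sec
through a distance $\mathbf{v}\,\Delta t$, and since only the component of the vector $\mathbf{v}$
normal to the element $\Delta\sigma$ contributes to the flow across this element, the
amount $\Delta Q$ of the fluid crossing $\Delta\sigma$ is
$$\Delta Q \doteq \rho\,\mathbf{v}\cdot\mathbf{n}\,\Delta\sigma\,\Delta t,$$
where $\rho$ is the density of the fluid (Fig. 30).
The entire amount $Q$ of fluid flowing out of the volume $\tau$, which is
bounded by $\sigma$, in $\Delta t$ sec is
$$Q = \Delta t\int_\sigma \rho\,\mathbf{v}\cdot\mathbf{n}\,d\sigma.$$
On the other hand, the quantity of the
fluid originally contained in $\tau$ will have
diminished by the amount
$$Q = -\Delta t\int_\tau \frac{\partial\rho}{\partial t}\,d\tau,$$
for the change in mass in $\Delta t$ sec is nearly equal to $(\partial\rho/\partial t)\,\Delta t\,\Delta\tau$, and the
negative sign is taken because $\rho$ is a decreasing function of $t$.
Equating these two expressions for $Q$ gives
$$\int_\sigma \rho\,\mathbf{v}\cdot\mathbf{n}\,d\sigma = -\int_\tau \frac{\partial\rho}{\partial t}\,d\tau, \qquad \text{(17-1)}$$
and the application of the divergence theorem to the left-hand member of
this equation yields
$$\int_\tau \operatorname{div}(\rho\mathbf{v})\,d\tau = -\int_\tau \frac{\partial\rho}{\partial t}\,d\tau$$
or
$$\int_\tau \left[\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v})\right] d\tau = 0.$$
Since the integrand is continuous and the volume $\tau$ is arbitrary, one can
conclude that
$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0. \qquad \text{(17-2)}$$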
This is the basic equation of hydrodynamics, known as the equation of
continuity. It merely expresses the law of conservation of matter.
It has been assumed that there are no sources or sinks within the region
occupied by the fluid. If matter is created at the rate $k\rho(x,y,z,t)$, then the
right-hand member of (17-1) should include a term that accounts for the
increase of mass per second due to such sources, namely,
$$\int_\tau k\rho\,d\tau.$$
In this event the equation of continuity reads
$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = k\rho.$$
The constant of proportionality k is sometimes called the growth factor.
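The density $\rho(x,y,z,t)$ of the fluid at the location $(x,y,z)$ of the fluid
particle depends on $t$ explicitly and on $x$, $y$, $z$ implicitly, since the particle
coordinates change with time as the particle is displaced. Thus,
$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + \frac{\partial\rho}{\partial x}\frac{dx}{dt} + \frac{\partial\rho}{\partial y}\frac{dy}{dt} + \frac{\partial\rho}{\partial z}\frac{dz}{dt}. \qquad \text{(17-3)}$$
In this equation, $d\rho/dt$ means the rate of change of density as one moves
with the fluid, whereas $\partial\rho/\partial t$ is the rate of change of density at a fixed
point.
Upon noting that
$$\mathbf{v} = \mathbf{i}\frac{dx}{dt} + \mathbf{j}\frac{dy}{dt} + \mathbf{k}\frac{dz}{dt} \qquad \text{and} \qquad \nabla\rho = \mathbf{i}\frac{\partial\rho}{\partial x} + \mathbf{j}\frac{\partial\rho}{\partial y} + \mathbf{k}\frac{\partial\rho}{\partial z},$$
we can write the formula (17-3) as
$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + \mathbf{v}\cdot\nabla\rho. \qquad \text{(17-4)}$$
Substituting from (17-2) in (17-4) gives
$$\frac{d\rho}{dt} = -\operatorname{div}(\rho\mathbf{v}) + \mathbf{v}\cdot\nabla\rho. \qquad \text{(17-5)}$$
But $\operatorname{div}(\rho\mathbf{v}) = \mathbf{v}\cdot\nabla\rho + \rho\operatorname{div}\mathbf{v}$ (see Prob. 36, Sec. 7), so that (17-5)
becomes
$$\frac{d\rho}{dt} = -\rho\operatorname{div}\mathbf{v},$$
or
$$\operatorname{div}\mathbf{v} = -\frac{1}{\rho}\frac{d\rho}{dt}. \qquad \text{(17-6)}$$
It is clear from (17-6) that div $\mathbf{v}$ is equal to the relative rate of change of
the density $\rho$ at any point of the fluid. Therefore, if the fluid is incompressible,
the velocity field is characterized by the equation
$$\operatorname{div}\mathbf{v} = 0. \qquad \text{(17-7)}$$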
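The step from (17-5) to (17-6) rests on the identity $\operatorname{div}(\rho\mathbf{v}) = \mathbf{v}\cdot\nabla\rho + \rho\operatorname{div}\mathbf{v}$. The following Python sketch checks it by central differences. The density `rho`, the velocity field `vel`, the sample point, and the step size are arbitrary smooth choices introduced here for illustration; they do not represent any particular flow.

```python
import math

def rho(x, y, z):
    # hypothetical smooth density
    return 1.0 + 0.5 * math.sin(x) * math.cos(y) + 0.1 * z * z

def vel(x, y, z):
    # hypothetical smooth velocity field (v_x, v_y, v_z)
    return (y * z, math.sin(x) + z, x * y * z)

def d(f, p, i, e=1e-5):
    """Central-difference partial derivative of f with respect to coordinate i at p."""
    a, b = list(p), list(p)
    a[i] += e
    b[i] -= e
    return (f(*a) - f(*b)) / (2.0 * e)

def div_rho_v(p):
    # left-hand side: div(rho * v)
    return sum(d(lambda *q, i=i: rho(*q) * vel(*q)[i], p, i) for i in range(3))

def product_rule_side(p):
    # right-hand side: v . grad(rho) + rho * div(v)
    v = vel(*p)
    grad_rho = [d(rho, p, i) for i in range(3)]
    div_v = sum(d(lambda *q, i=i: vel(*q)[i], p, i) for i in range(3))
    return sum(vi * gi for vi, gi in zip(v, grad_rho)) + rho(*p) * div_v

point = (0.4, -1.1, 0.7)
print(div_rho_v(point), product_rule_side(point))
```

The two printed values should agree to within the accuracy of the difference quotients.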
If the flow of fluid is irrotational, then $\operatorname{curl}\mathbf{v} = 0$, and one is assured
that there exists a scalar function $\Phi$ such that
$$\mathbf{v} = \nabla\Phi.$$
Substituting this in (17-7) gives the differential equation to be satisfied
by $\Phi$, namely,
$$\nabla^2\Phi \equiv \frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} + \frac{\partial^2\Phi}{\partial z^2} = 0. \qquad \text{(17-8)}$$
The function $\Phi$ is called the velocity potential. A similar result was obtained
in Sec. 15 for the two-dimensional flow.
If the fluid is ideal, that is, such that the force due to pressure on any
surface element is always directed normally to that surface element, one
can easily derive Euler’s equations of hydrodynamics. Denote the pressure
at any point of the fluid by $p$; then the force acting on a surface element
$\Delta\sigma$ is $-p\mathbf{n}\,\Delta\sigma$, and the resultant force acting on an arbitrary closed surface
$\sigma$ is
$$-\int_\sigma p\mathbf{n}\,d\sigma.$$
The negative sign is chosen because the force due to pressure acts in the
direction of the interior normal, whereas $\mathbf{n}$ denotes the unit exterior normal.
Let the body force, per unit mass, acting on the masses contained
within the region $\tau$ be $\mathbf{F}$; then the resultant of the body forces is
$$\int_\tau \mathbf{F}\rho\,d\tau.$$
Hence, the resultant $\mathbf{R}$ of the body and surface forces is
$$\mathbf{R} = \int_\tau \mathbf{F}\rho\,d\tau - \int_\sigma p\mathbf{n}\,d\sigma = \int_\tau \mathbf{F}\rho\,d\tau - \int_\tau \nabla p\,d\tau, \qquad \text{(17-9)}$$
where the last step is obtained by making use of (8-6).
From Newton’s law of motion, the resultant force is equal to
$$\mathbf{R} = \int_\tau \frac{d^2\mathbf{r}}{dt^2}\,\rho\,d\tau, \qquad \text{(17-10)}$$
where $\mathbf{r} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z$ is the position vector of the masses relative to
the origin of cartesian coordinates. It follows from (17-9) and (17-10) that
$$\int_\tau \left(\mathbf{F}\rho - \nabla p - \rho\,\frac{d^2\mathbf{r}}{dt^2}\right) d\tau = 0,$$
and since the volume element is arbitrary and the integrand is continuous,
$$\rho\,\frac{d^2\mathbf{r}}{dt^2} = \mathbf{F}\rho - \nabla p. \qquad \text{(17-11)}$$
This is the desired equation in vector form, and it is basic in hydro- and
aerodynamical applications.
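In books on hydrodynamics, the cartesian components of the velocity
vector $d\mathbf{r}/dt$ are usually denoted by $u$, $v$, and $w$, so that
$$\frac{d\mathbf{r}}{dt} = \mathbf{i}u + \mathbf{j}v + \mathbf{k}w \equiv \mathbf{i}\frac{dx}{dt} + \mathbf{j}\frac{dy}{dt} + \mathbf{k}\frac{dz}{dt}.$$
Since $u$, $v$, and $w$ are functions of the coordinates of the point $(x,y,z)$ and
of the time $t$, it follows that
$$\frac{d^2\mathbf{r}}{dt^2} = \mathbf{i}\left(\frac{\partial u}{\partial t} + \frac{\partial u}{\partial x}\frac{dx}{dt} + \frac{\partial u}{\partial y}\frac{dy}{dt} + \frac{\partial u}{\partial z}\frac{dz}{dt}\right) + \mathbf{j}\left(\frac{\partial v}{\partial t} + \frac{\partial v}{\partial x}\frac{dx}{dt} + \frac{\partial v}{\partial y}\frac{dy}{dt} + \frac{\partial v}{\partial z}\frac{dz}{dt}\right) + \mathbf{k}\left(\frac{\partial w}{\partial t} + \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt}\right).$$
Substituting this expression in (17-11) and setting $\mathbf{F} = \mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z$
lead to three scalar equations, which are associated with the name of Euler:
$$\begin{aligned} \frac{\partial u}{\partial t} + \frac{\partial u}{\partial x}u + \frac{\partial u}{\partial y}v + \frac{\partial u}{\partial z}w &= F_x - \frac{1}{\rho}\frac{\partial p}{\partial x}, \\ \frac{\partial v}{\partial t} + \frac{\partial v}{\partial x}u + \frac{\partial v}{\partial y}v + \frac{\partial v}{\partial z}w &= F_y - \frac{1}{\rho}\frac{\partial p}{\partial y}, \\ \frac{\partial w}{\partial t} + \frac{\partial w}{\partial x}u + \frac{\partial w}{\partial y}v + \frac{\partial w}{\partial z}w &= F_z - \frac{1}{\rho}\frac{\partial p}{\partial z}. \end{aligned} \qquad \text{(17-12)}$$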
It is possible to show with the aid of these equations (and by making
some simplifying assumptions) that the propagation of sound is governed
approximately by the wave equation
$$\frac{\partial^2 s}{\partial t^2} = a^2\nabla^2 s.$$
In this equation, $a$ is the velocity of sound and $s$ is related to the density
$\rho$ of the medium by the formula
$$s = \frac{\rho}{\rho_0} - 1,$$
where $\rho_0$ is the density of the medium at rest.
CHAPTER 6
PARTIAL DIFFERENTIAL EQUATIONS
The Vibrating String
1. Arbitrary Functions: One-dimensional Waves
2. Derivation of a Differential Equation
3. Initial Conditions
4. Characteristics
5. Boundary Conditions
6. Initial and Boundary Conditions
7. Damped Oscillations
8. Forced Oscillations and Resonance
Solution by Series
9. Heat Flow in One Dimension
10. Other Boundary Conditions. Separation of Variables
11. Heat Flow in a Solid
12. The Dirichlet Problem
13. Spherical Symmetry. Legendre Functions
14. The Rectangular Membrane. Double Fourier Series
15. The Circular Membrane. Bessel Functions
Solution by Integrals
16. The Fourier Transform
17. Waves in a Half Plane
18. The Convolution Theorem
19. The Source Functions for Heat Flow
20. A Singular Integral
21. The Poisson Equation
22. The Helmholtz Formula
23. The Functions of Green and Neumann
Elliptic, Parabolic, and Hyperbolic Equations
24. Classification and Uniqueness
25. Further Discussion of Uniqueness
26. The Associated Difference Equations
27. Further Discussion of Difference Equations
28. An Example: Flow of Electricity in a Cable
29. Characteristics and Canonical Form
30. Characteristics and Discontinuities
Equations containing partial derivatives arise in many branches of
mathematical physics. Fluid flow, heat transfer, wave motion, electro-
magnetic theory, elasticity, quantum mechanics, nuclear physics, and
meteorology are but a few of the fields that involve a study of such equations.
In this chapter we give representative examples, indicating some
of the more important methods of solution. In contrast to the theory of
ordinary differential equations, it will be seen that now the general solution
is seldom sought. The main problem, rather, is to find that particular
solution which satisfies the determinative conditions (the so-called initial
values and boundary values) of the specific problem in hand.
THE VIBRATING STRING
1. Arbitrary Functions: One-dimensional Waves. A partial differential
equation of order $n$ is an equation containing partial derivatives of order $n$
but no higher derivatives. For example, each of the three equations
$$\frac{\partial^2 u}{\partial t^2} = a^2\frac{\partial^2 u}{\partial x^2}, \qquad \frac{\partial u}{\partial t} = \alpha^2\frac{\partial^2 u}{\partial x^2}, \qquad \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$$
is a partial differential equation of order 2. In this chapter we shall often
use the subscript notation for derivatives, so that the foregoing expressions
can be written more briefly as
$$u_{tt} = a^2u_{xx}, \qquad u_t = \alpha^2u_{xx}, \qquad u_{xx} + u_{yy} + u_{zz} = 0. \qquad \text{(1-1)}$$
A function u that satisfies a given partial differential equation is called
a solution of the equation. For example, the function
u = cos x cos at (1-2)
is a solution of the first Eq. (1-1), because (1-2) gives
$$u_x = -\sin x\cos at, \qquad u_t = \cos x\,(-a\sin at),$$
$$u_{xx} = -\cos x\cos at, \qquad u_{tt} = \cos x\,(-a^2\cos at) = a^2u_{xx}.$$
The reader will recall that the general solution of an ordinary differential
equation contains arbitrary constants; for example, the general solution
of $y'' + y = 0$ is
$$y = c_1\sin x + c_2\cos x,$$
which has the arbitrary constants $c_1$ and $c_2$. We shall see that many important
partial differential equations have solutions which contain arbitrary
functions and, conversely, the elimination of arbitrary functions from a
given expression often leads to a partial differential equation.
As an illustration of this fact let
$$u = f(x + y), \qquad \text{(1-3)}$$
where $f$ is an arbitrary differentiable function. If the argument of $f$ is
denoted by $s = x + y$, then
$$u = f(x + y) = f(s)$$
and the chain rule$^1$ gives
$$u_x \equiv \frac{\partial u}{\partial x} = \frac{df}{ds}\frac{\partial s}{\partial x} = \frac{df}{ds} = f'(s).$$
Similarly, $u_y = f'(s)$, and hence $u$ satisfies
$$u_x = u_y \qquad \text{(1-4)}$$
for any and all choices of the differentiable function $f$.
Conversely, let $u(x,y)$ be a solution of (1-4). If we set $s = x + y$, then
$$u(x,y) = u(x, s - x) \equiv U(x,s).$$
The chain rule gives
$$u_x = U_x\frac{\partial x}{\partial x} + U_s\frac{\partial s}{\partial x} = U_x + U_s$$
and, similarly, $u_y = U_s$. Substituting into (1-4) we get
$$U_x + U_s = U_s,$$
which shows that $U_x = 0$. It follows that $U$ is a function of $s$ only,
$$U = f(s) = f(x + y),$$
and hence the same is true of $u$. Thus, (1-3) follows from (1-4).
For an example containing two arbitrary functions, let
$$U = f_1(r) + f_2(s), \qquad f_1 \text{ and } f_2 \text{ differentiable}, \qquad \text{(1-5)}$$
where $r$ and $s$ are the independent variables. Then $U_r = f_1'(r)$, and hence
$$U_{rs} = 0. \qquad \text{(1-6)}$$
1 The reader may find it advisable to review Chap. 3, Sec. 4.
(The reader can verify that also $U_{sr} = 0$.) Conversely, from (1-6) we have
$$\frac{\partial}{\partial s}(U_r) = 0$$
so that $U_r$ is independent of $s$:
$$U_r = h(r), \qquad \text{a function of } r \text{ only}. \qquad \text{(1-7)}$$
If we write $f_1(r) = \int h(r)\,dr$, then Eq. (1-7) yields
$$\frac{\partial}{\partial r}(U - f_1) = 0,$$
so that $U - f_1(r) = f_2(s)$, a function of $s$ only. Thus, $U$ has the form
(1-5).
An important example of the elimination of arbitrary functions arises
from the situation shown in Fig. 1. If $t$ is time, it is seen that $f_1(x - at)$
represents a wave form which propagates in the positive $x$ direction with
velocity $a$ and with no change in shape, that is, with no dispersion. In a
similar manner, $f_2(x + at)$ represents a wave form which propagates in
the opposite direction with velocity $a$. The most general one-dimensional
wave without dispersion is a superposition of two such, namely,
$$u = f_1(x - at) + f_2(x + at). \qquad \text{(1-8)}$$
Suppose, now, that $u(x,t)$ is given by (1-8), with $f_1$ and $f_2$ twice differentiable.
If we set
$$x - at = r, \qquad x + at = s, \qquad \text{(1-9)}$$
then $u = f_1(r) + f_2(s)$, and by the chain rule
$$u_x = \frac{\partial u}{\partial x} = \frac{d(f_1 + f_2)}{dr}\frac{\partial r}{\partial x} + \frac{d(f_1 + f_2)}{ds}\frac{\partial s}{\partial x} = f_1'(x - at) + f_2'(x + at).$$
The reader may verify similarly that
$$u_t = f_1'(x - at)(-a) + f_2'(x + at)(a).$$
Differentiating again gives
$$u_{xx} = f_1''(x - at) + f_2''(x + at),$$
$$u_{tt} = f_1''(x - at)(-a)^2 + f_2''(x + at)(a)^2,$$
and hence $u$ satisfies the partial differential equation
$$u_{tt} = a^2u_{xx}. \qquad \text{(1-10)}$$
We show conversely that every solution $u(x,t)$ of (1-10) has the form (1-8) and thus
represents the superposition of two waves propagating with velocity $a$. The substitution
(1-9) gives
$$u(x,t) = U(r,s)$$
so that, by using the chain rule as in the previous discussion,
$$u_x = U_r + U_s, \qquad u_t = -aU_r + aU_s.$$
Differentiating again yields
$$u_{xx} = U_{rr} + 2U_{rs} + U_{ss},$$
$$u_{tt} = a^2U_{rr} - 2a^2U_{rs} + a^2U_{ss}.$$
If we substitute these values into (1-10) we get (1-6). As we have already seen, this
ensures that $U$ has the form (1-5), and hence $u$ has the form (1-8).
Equation (1-10) is satisfied by the most general one-dimensional wave
motion with velocity a; and conversely, every solution of (1-10) represents
such a motion. For this reason (1-10) is called the wave equation . To-
gether with its analogues in two and three dimensions, (1-10) is an impor-
tant aid in the study of many vibration phenomena.
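That a superposition of the form (1-8) satisfies the wave equation (1-10) for quite arbitrary wave forms can be spot-checked numerically. In the sketch below the functions `f1` and `f2`, the velocity `A_SPEED`, and the step size are illustrative assumptions; the second derivatives are approximated by central second differences, so the residual is small but not exactly zero.

```python
import math

A_SPEED = 2.0  # the wave velocity a (illustrative value)

def f1(s):
    # an arbitrary twice-differentiable wave form moving to the right
    return math.exp(-s * s)

def f2(s):
    # an arbitrary twice-differentiable wave form moving to the left
    return math.sin(3.0 * s)

def u(x, t):
    # superposition of the form (1-8)
    return f1(x - A_SPEED * t) + f2(x + A_SPEED * t)

def second_diff(g, s, d=1e-3):
    """Central second-difference approximation of g''(s)."""
    return (g(s + d) - 2.0 * g(s) + g(s - d)) / (d * d)

def wave_residual(x, t):
    # u_tt - a**2 * u_xx, which should vanish for a solution of (1-10)
    u_tt = second_diff(lambda tau: u(x, tau), t)
    u_xx = second_diff(lambda xi: u(xi, t), x)
    return u_tt - A_SPEED ** 2 * u_xx

print(wave_residual(0.3, 0.2), wave_residual(-1.0, 0.7))
```

Replacing `f1` and `f2` by any other smooth functions should leave the residual near zero, which is the content of the statement that (1-8) contains two arbitrary functions.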
Example: Standing Waves. The motion given by
$$f_1(x - at) = A\sin k(x - at), \qquad A, k \text{ const}, \qquad \text{(1-11)}$$
represents a sine wave of amplitude $A$ and wavelength $\lambda = 2\pi/k$, moving to the right
with velocity $a$. The period $T$ is the time required for the wave to progress a distance
equal to one wavelength, so that $\lambda = aT$ or
$$T = \frac{\lambda}{a} = \frac{2\pi}{ka}.$$
Similarly, a motion described by
$$f_2(x + at) = A\sin k(x + at) \qquad \text{(1-12)}$$
represents a sine wave, of the same amplitude and period, moving with velocity $a$ to
the left. The superposition of (1-11) and (1-12) gives
$$u = A\sin k(x - at) + A\sin k(x + at),$$
which becomes
$$u = (2A\cos kat)\sin kx \qquad \text{(1-13)}$$
when we recall the trigonometric identities
$$\sin k(x \pm at) = \sin kx\cos kat \pm \cos kx\sin kat.$$
The expression (1-13) may be regarded as a sinusoid $\sin kx$ whose amplitude $2A\cos kat$
varies with the time $t$ in a simply harmonic manner. Several curves of (1-13) are sketched
in Fig. 2 for various values of $t$. The points $n\pi/k$ remain fixed throughout the motion
and are called nodes. Although the result was obtained by superposing two traveling
waves, the wave form (1-13) does not appear to travel either to the left or the right,
and (1-13) is said to represent a standing wave.
The number $f$ of oscillations or cycles made by the wave per unit time is called the
frequency. From the definition of the period $T$, it follows that $f = 1/T$.
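The passage from the two traveling waves (1-11) and (1-12) to the standing wave (1-13), and the fact that the nodes $x = n\pi/k$ stay fixed, can be confirmed with a few lines of Python. The amplitude, wave number, and velocity used as defaults below are arbitrary illustrative values, and the function names are assumptions of this sketch.

```python
import math

def travelling_sum(x, t, A=1.5, k=2.0, a=3.0):
    # superposition of the right-moving wave (1-11) and the left-moving wave (1-12)
    return A * math.sin(k * (x - a * t)) + A * math.sin(k * (x + a * t))

def standing(x, t, A=1.5, k=2.0, a=3.0):
    # standing-wave form (1-13)
    return 2.0 * A * math.cos(k * a * t) * math.sin(k * x)

# the two expressions agree at arbitrary points
for x, t in [(0.3, 0.0), (1.1, 0.4), (-2.0, 1.7)]:
    print(travelling_sum(x, t), standing(x, t))

# at a node x = n*pi/k the displacement stays zero for every t
node = 3 * math.pi / 2.0
print(max(abs(travelling_sum(node, 0.1 * j)) for j in range(50)))
```

With $k = 2$ and $a = 3$ the period is $T = 2\pi/(ka) = \pi/3$, so the printed displacements repeat after that time.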
PROBLEMS
1. If $u = f(y/x)$ with $f$ differentiable, show that
$$xu_x + yu_y = 0 \qquad \text{for } x \ne 0.$$
2. Show by direct differentiation that $u = \sin kx\sin kat$ satisfies the one-dimensional
wave equation for every choice of the constant $k$, and express this function in the form
(1-8).
3. (a) By computing $u_x$, $u_y$, and $u_{xy}$ obtain a second-order partial differential equation
for $u = f_1(x)f_2(y)$. (b) Show that your result is equivalent to $(\log u)_{xy} = 0$, and explain.
4. For many functions the chain rule applies even when the argument is complex.
Assuming this, show that
$$u = f_1(x + iy) + f_2(x - iy), \qquad i^2 = -1,$$
satisfies Laplace's equation $u_{xx} + u_{yy} = 0$.
5. Let $f(x + iy) = u(x,y) + iv(x,y)$, where $u$ and $v$ are real. Using the chain rule,
show that $u$ and $v$ satisfy the Cauchy-Riemann equations
$$u_x = v_y, \qquad u_y = -v_x.$$
6. Show that $u = f(\alpha y - \beta x)$ satisfies
$$\alpha u_x + \beta u_y = 0$$
if $f$ is differentiable and $\alpha$, $\beta$ are constant.
The Linear Equation with Constant Coefficients
7. The operators $D_x^n$ and $D_y^n$ are defined by
$$D_x^n = \frac{\partial^n}{\partial x^n}, \qquad D_y^n = \frac{\partial^n}{\partial y^n},$$
and we agree also, for example, that
$$(\alpha D_x + \beta D_y)u = \alpha D_xu + \beta D_yu \equiv \alpha u_x + \beta u_y.$$
(a) If $m_i$ are constant use the result of Prob. 6 to solve each of the equations
$$(D_x - m_1D_y)u = 0, \qquad (D_x - m_2D_y)u = 0.$$
(b) Show that both solutions obtained in (a) satisfy
$$(D_x - m_1D_y)(D_x - m_2D_y)u = 0. \qquad \text{(1-14)}$$
Hint: Since $m_i$ are constant, Eq. (1-14) may also be written
$$(D_x - m_2D_y)(D_x - m_1D_y)u = 0.$$
(c) Deduce that a solution of (1-14), containing two arbitrary functions, is
$$u = F_1(y + m_1x) + F_2(y + m_2x), \qquad m_1 \ne m_2,$$
$$u = F_1(y + m_1x) + xF_2(y + m_1x), \qquad m_1 = m_2.$$
Hint: Since the equation is linear and homogeneous, the sum of the two solutions
in (b) is again a solution. The result for $m_1 = m_2$ may be verified by
direct substitution.
[Similar results hold in general. The solution of
$$(D_x - mD_y)^ru = 0$$
can be shown to be
$$u = F_1(y + mx) + xF_2(y + mx) + \cdots + x^{r-1}F_r(y + mx)$$
and the solution for several such factors is obtained by addition (cf. Chap. 1, Sec. 21).
The process gives the “general solution” in that the number of arbitrary functions
equals the order of the equation.]
8. The fourth-order equation
$\frac{\partial^4 u}{\partial x^4} + 2\frac{\partial^4 u}{\partial x^2\,\partial y^2} + \frac{\partial^4 u}{\partial y^4} = 0$
occurs in the study of elastic plates. Show that the general solution is
$u = F_1(y - ix) + x F_2(y - ix) + F_3(y + ix) + x F_4(y + ix)$.
Hint: The equation may be written
$(D_x^4 + 2 D_x^2 D_y^2 + D_y^4)u = 0$
so that the decomposition into linear factors gives
$(D_x + iD_y)(D_x + iD_y)(D_x - iD_y)(D_x - iD_y)u = 0$.
Use the result of Prob. 7.
9. As in Probs. 7 and 8, solve:
(a) u xx - a 2 tf w - 0; ( b ) u xx 4* « 2^;
(0 Wxz 4* Uyv ** (d) Uxx + Uyy s* 2l4dey.
10. Consider the equation
U xx *f 4 Uxy 5 Uyy - /(X,J/).
(a) By the method of Prob. 7 obtain a general solution when $f = 0$.
(b) By assuming $u = cy^4$, where $c$ is a constant to be determined, obtain a
particular solution when $f = y^2$.
(c) Similarly, obtain a particular solution when $f = x$.
(d) By addition of the results (a), (b), (c) obtain the general solution when
$f = y^2 + x$.
11. As in Prob. 10, obtain the general solution: (a) 2 z xx + ^ " 1 ; (b) z xx — a 2 ^
« X 2 ; (c) Z xx -f 3 Try 4* 2 Zyy “T-f J/.
2. Derivation of a Differential Equation. Consider a flexible, elastic
string stretched between two supports on the x axis (Fig. 3). To obtain
a differential equation for the motion, let $u(x,t)$ represent the vertical
distance from the point $x$ on the x axis to the string at time $t$. We shall
apply Newton's law,
(Mass)(acceleration of center of mass) = force,  (2-1)
to the short piece of string between $x$ and $x + \Delta x$.
The mass of the short piece is
Mass = $\rho\,\Delta s$
where $\rho$ is the mean density and $\Delta s$ the length. The vertical component
of acceleration for the short piece is
Vertical acceleration = $\partial^2 \bar{u}/\partial t^2$
if $\bar{u}$ is the height of the center of mass above the x axis. To compute the
vertical component of force we let $T$ be the tension, and we introduce the
angle $\theta$ between the tension vector and the x axis. By Fig. 3 the vertical
component is
Vertical force due to tension = $(T \sin\theta)\big|_{x+\Delta x} - (T \sin\theta)\big|_{x}$.
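If there is an additional vertical force $F_1(x,t)\,\Delta x$ due to other causes,
substituting into (2-1) yields
$\rho\,\Delta s\,\frac{\partial^2 \bar{u}}{\partial t^2} = (T \sin\theta)\big|_{x+\Delta x} - (T \sin\theta)\big|_{x} + F_1(x,t)\,\Delta x$.
Upon dividing by $\Delta x$ and letting $\Delta x \to 0$ we get
$\rho\,\frac{ds}{dx}\,\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}(T \sin\theta) + F_1(x,t)$  (2-2)
if the required derivatives are continuous.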
To obtain a simpler equation, note that the definition of arc length
yields¹
$\frac{ds}{dx} = \left[1 + \left(\frac{\partial u}{\partial x}\right)^2\right]^{1/2} \sim 1$
and also²
$\sin\theta = \tan\theta\,(1 + \tan^2\theta)^{-1/2} = u_x (1 + u_x^2)^{-1/2} \sim u_x$,
if $u_x^2 \ll 1$. Moreover, if the displacement $u$ is small, we can consider
$T$ = const. Substituting into (2-2) yields the approximate equation
$\rho u_{tt} = T u_{xx} + F_1(x,t)$.
This in turn may be written
$u_{tt} = a^2 u_{xx} + F(x,t)$,  (2-3)
where $a = \sqrt{T/\rho}$ and $F(x,t) = \rho^{-1} F_1(x,t)$.
Equation (2-3) will be considered in the sequel under the assumption that
$\rho$, and hence $a$, is constant.
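The two small-slope approximations used above, $ds/dx \sim 1$ and $\sin\theta \sim u_x$, can be examined numerically. The following sketch is an illustration only; the function names and the sample slopes are assumptions chosen for the demonstration.

```python
import math

def arc_length_factor(ux):
    """Exact ds/dx = (1 + u_x^2)^(1/2)."""
    return math.sqrt(1.0 + ux * ux)

def sin_theta(ux):
    """Exact sin(theta) = u_x (1 + u_x^2)^(-1/2), with tan(theta) = u_x."""
    return ux / math.sqrt(1.0 + ux * ux)

# Error of ds/dx ~ 1 and relative error of sin(theta) ~ u_x for several slopes.
for ux in (0.5, 0.1, 0.01):
    print(ux, arc_length_factor(ux) - 1.0, (ux - sin_theta(ux)) / ux)
```

Both errors shrink roughly like $u_x^2/2$, which is why the condition $u_x^2 \ll 1$ is the one that matters.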
When the force function $F(x,t)$ is zero, the vibrations of the string are
termed free vibrations. By (2-3) the equation for free vibrations is
$u_{tt} = a^2 u_{xx}$  (2-4)
and hence the solution has the form (1-8). According to the discussion
in Sec. 1, the motion can always be regarded as a superposition of two waves
moving with velocity
$a = \sqrt{T/\rho}$  (2-5)
1 The symbol $\sim$ (read “is asymptotic to”) means that the ratio of the two sides tends
to 1. A discussion of this useful notation is given in Chap. 1, Sec. 2.
2 The fact that the string is flexible means that the tension vector is tangent to the
string, so that
$\tan\theta = \text{slope of curve} = \partial u/\partial x$.
in opposite directions. Later we shall determine the precise form of these
waves by considering the initial state of the string, that is, the state at
$t = 0$, together with the conditions at the end points, $x = 0$ and $x = l$.
Inasmuch as the constant $a$ in (2-4) involves only the ratio $T/\rho$, two
strings may behave similarly even if made of different materials. For
example, a string with density $2\rho$ under tension $2T$ behaves like a string
with density $\rho$ and tension $T$, since both yield the same value for $a$. An
equivalence of two different physical systems such as this is sometimes
called a principle of similitude.
The study of similitude belongs to an interesting branch of mathematical physics
known as dimensional analysis. Although a general development¹ will not be given
here, we shall describe the underlying idea as it applies to (2-5).
Equation (2-5) relates three quantities $a$, $T$, and $\rho$ which are expressed in different
physical units. In the mks system²
$a\ \left[\frac{\text{meters}}{\text{second}}\right], \quad \rho\ \left[\frac{\text{kilograms}}{\text{meter}}\right], \quad T\ \left[\frac{\text{kilogram-meters}}{(\text{second})^2}\right]$,  (2-6)
where the square bracket is used to indicate that the measuring unit, rather than the
value, is being described. The value of $a$ for use in (2-5) is the number of such measuring
units, that is, the number of meters per second, and similarly for $\rho$ and $T$.
If we decide to measure lengths in centimeters rather than in meters, then the value
of $a$ will be increased by a factor 100. In other words, $100a$ cm per sec is the same as
$a$ m per sec. Similarly $T$ will be multiplied by 100, but $\rho$ will be divided by 100, since
the length unit for $\rho$ in (2-6) occurs in the denominator. (Indeed, $\rho$ kg per m is clearly
the same as $0.01\rho$ kg per cm.) Hence when a string has a wave velocity $a$, density $\rho$,
and tension $T$ in the old system (2-6), then the same string has velocity, density, and
tension
$100a, \quad \rho/100, \quad 100T$  (2-7)
in the new system. Substituting into (2-5) yields
$100a = \sqrt{\frac{100T}{\rho/100}}$
which is consistent with (2-5), as it should be. One does not get a contradictory result
by measuring all lengths in centimeters rather than in meters.
When we change meters into centimeters, we divide the unit of length by 100. More
generally, one might divide the unit by an arbitrary positive constant $\alpha$. The new
values of $a$, $\rho$, and $T$ would be, respectively,
$\alpha a, \quad \rho/\alpha, \quad \alpha T$  (2-8)
[compare (2-7)]. Similar changes may be made in the units of mass or of time. Equation
(2-5) remains self-consistent under such changes, as the reader can verify. The
question arises: Is (2-5) the only functional relationship
$a = f(\rho, T)$  (2-9)
which is consistent under such changes? If so, then we would have a proof of the functional
relation (2-5), assuming merely that there is a functional relation of some kind.
1 The reader is referred to P. W. Bridgman, “Dimensional Analysis,” Yale University
Press, New Haven, Conn., 1931, and S. Drobot, The Foundations of Dimensional
Analysis, Studia Math., 14:84-99 (1954).
2 The mks units for $T$ can be found from Newton’s law (2-1), since $T$ is a force.
To investigate this possibility, suppose (2-9) holds where $f$ is an unknown function
and where $a$, $T$, and $\rho$ stand for the numbers of their respective units of measurement in
(2-6). If the unit of length is divided by $\alpha$, then (2-8) gives
$\alpha a = f(\rho/\alpha, \alpha T)$
upon substitution into (2-9). Since $\alpha$ is arbitrary we may choose $\alpha = \rho$ to find
$\rho a = f(1, \rho T)$.  (2-10)
If we now divide the unit of mass by $\beta$, the value of $a$ is unchanged but $\rho$ and $T$ become
$\beta\rho$ and $\beta T$, respectively [see (2-6)]. Substituting into (2-10) yields
$\beta\rho a = f(1, \beta^2 \rho T)$.
Upon choosing $\beta^2 = (\rho T)^{-1}$, we get
$(\rho T)^{-1/2} \rho a = f(1,1)$,
so that
$a = c\sqrt{T/\rho}$  (2-11)
where $c = f(1,1)$ is a constant, independent of $a$, $\rho$, and $T$.
Finally, if we divide the unit of time by $\gamma$, the new values of $a$, $\rho$, and $T$ are given by
(2-6) as $a/\gamma$, $\rho$, and $T/\gamma^2$. Substituting into (2-11) gives
$\frac{a}{\gamma} = c\sqrt{\frac{T/\gamma^2}{\rho}}$
which reduces to (2-11) again. Thus, no new information is obtained by changing the
unit of time, and the constant $c$ in (2-11) cannot be found by dimensional analysis.
But we can determine $c$ by considering the limiting case of small oscillations. The
partial differential equation (2-4) is then valid, and (2-5) shows that $c = 1$.
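The invariance argument can be tried out on numbers. The sketch below assumes the scaling rules read off from (2-6); the function names and the sample values of $\rho$, $T$, $\alpha$, $\beta$, $\gamma$ are hypothetical and serve only as an illustration.

```python
import math

def wave_speed(rho, T, c=1.0):
    """Relation (2-11): a = c * sqrt(T / rho); c = 1 gives (2-5)."""
    return c * math.sqrt(T / rho)

def rescale(a, rho, T, alpha=1.0, beta=1.0, gamma=1.0):
    """New numerical values of (a, rho, T) after the units of length, mass,
    and time are divided by alpha, beta, and gamma, using the units (2-6)."""
    a_new = a * alpha / gamma               # meters / second
    rho_new = rho * beta / alpha            # kilograms / meter
    T_new = T * beta * alpha / gamma ** 2   # kilogram-meters / second^2
    return a_new, rho_new, T_new

rho, T = 0.5, 8.0
a = wave_speed(rho, T)
# Change to centimeters, grams, and minutes; the relation should survive.
a2, rho2, T2 = rescale(a, rho, T, alpha=100.0, beta=1000.0, gamma=1.0 / 60.0)
print(a2, wave_speed(rho2, T2))
# Principle of similitude: doubling both density and tension leaves a unchanged.
print(wave_speed(rho, T), wave_speed(2.0 * rho, 2.0 * T))
```

Any other constant c in `wave_speed` passes the same check, which mirrors the remark that dimensional analysis alone cannot determine c.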
PROBLEMS
1. The displacement of a certain string is
$u(x,t) = f_1(x - at) + f_2(x + at)$.
What is the physical meaning of the condition $u(0,t) \equiv 0$? If $u(0,t) \equiv 0$, express $f_1$ in
terms of $f_2$, and thus deduce
$u(x,t) = f_2(x + at) - f_2(-x + at)$.
2. (a) Find $f(x - at)$ when $f(x) = (1 + x^2)^{-1}$; when $f(x) = \sin kx$; when $f(x) = e^x$. In
each case compute also $f'(x - at)$. Hint: Substitute $x - at$ for $x$ in the expressions for
$f(x)$. (b) If $u(x,t) = f(x - at) + f(x + at)$, find $u(0,t)$, $u(x,0)$, and $u(l,l/a)$ for each $f(x)$
in part (a) of this problem.
3. In the derivation of (2-3) we observed that $\sin\theta \sim u_x$, but we used this result in
the form $(\sin\theta)_x \sim (u_x)_x$. (a) By differentiating the exact formula
$\sin\theta = u_x (1 + u_x^2)^{-1/2}$
with respect to $x$, show that $(\sin\theta)_x \sim u_{xx}$ is correct provided $u_{xx}$ is bounded. Also
show that the error is of the order of $u_x^2$ in this case. (b) By considering $u = 1 + x$,
$v = 1 + 2x$ near $x = 0$, show that the equation $u \sim v$ does not always enable us to
conclude $u_x \sim v_x$. (In other words: If two functions approximate each other the
derivatives need not approximate each other and a separate investigation must be given.)
4. Show that the small longitudinal vibrations of a uniform long rod satisfy the
differential equation
$\frac{\partial^2 u}{\partial t^2} = \frac{E}{\rho}\,\frac{\partial^2 u}{\partial x^2}$,
where $u$ is the displacement of a point originally at a distance $x$ from the end of the rod,
$E$ is the modulus of elasticity, and $\rho$ is the density. Hint: From the definition of Young’s
modulus $E$, the force on a cross-sectional area $q$ at a distance $x$ units from the end of
the rod is $Eq(\partial u/\partial x)$, since $\partial u/\partial x$ is the extension per unit length. On the other hand,
the force on an element of the rod of length $\Delta x$ is $\rho q\,\Delta x\,\partial^2 u/\partial t^2$.
5. If the rod of Prob. 4 is made of steel for which $E = 22 \times 10^8$ g per cm² and whose
specific gravity is 7.8, show that the velocity of propagation of sound in steel is nearly
$5.3 \times 10^5$ cm per sec, which is about sixteen times as great as the velocity of sound in
air. Note that in the cgs system $E$ must be expressed in dynes per square centimeter.
6. Show that the differential equation of the transverse vibrations of an elastic rod
carrying a load of $p(x)$ lb per unit length is
$EI\,\frac{\partial^4 y}{\partial x^4} = p(x) - m\,\frac{\partial^2 y}{\partial t^2}$,
where $E$ = modulus of elasticity
$I$ = moment of inertia of cross-sectional area of rod about a horizontal transverse
axis through center of gravity
$m$ = mass per unit length
Hint: For small deflections the bending moment $M$ about a horizontal transverse axis
at a distance $x$ from the end of the rod is given by the Euler formula $M = EI\,d^2y/dx^2$,
and the shearing load $p(x)$ is given by $d^2M/dx^2 = p(x)$.
3. Initial Conditions. In the previous section the wave equation
$u_{tt} = a^2 u_{xx}, \quad a = \text{const}$,  (3-1)
was derived for small displacements of a uniform flexible string. According
to Sec. 1 the general solution of (3-1) is
$u(x,t) = f_1(x - at) + f_2(x + at)$  (3-2)
where $f_1$ and $f_2$ are arbitrary twice-differentiable functions.¹ We shall
1 Actually, (3-2) is meaningful whenever $f_1$ and $f_2$ are well defined, and hence, conditions
of differentiability are not emphasized in the sequel. A nondifferentiable function
(such as the function shown in Fig. 4) is regarded as being a “solution” of (3-1) if it can
be approximated, with arbitrary precision, by smooth solutions. See: I. G. Petrovsky,
“Partial Differential Equations,” p. 05, Cambridge University Press, New York, 1954.
now see that these functions can be determined from the initial conditions,
that is, from the conditions at time $t = 0$. It is convenient to regard the
string as infinite and the conditions as given for $-\infty < x < \infty$. The
effect of the end points $x = 0$ and $x = l$ will be considered in Sec. 5.
Fig. 4. Ordinates on the resultant wave are obtained by forming one-half the sum
of the oppositely moving waves shown by the dashed lines.
Case I. Initial Impulse 0. Assume that the string is released from rest
and that the initial shape is given by a known function $f(x)$. (Such a situation
arises when the string is plucked, as in a harpsichord.) In symbols,
$u(x,0) = f(x), \quad u_t(x,0) = 0$,  (3-3)
where the second Eq. (3-3) expresses the fact that the vertical velocity
$\partial u/\partial t$ is initially 0 for each point $x$ of the string. By (3-2) we get
$u_t(x,t) = -a f_1'(x - at) + a f_2'(x + at)$  (3-4)
upon using the chain rule as in Sec. 1. Since $u_t(x,0) = 0$, Eq. (3-4) gives
$f_1'(x) = f_2'(x)$
after dividing by $a$. It follows that $f_2(x) = f_1(x) + c$, where $c$ is constant.
Using this equality with $x$ replaced by $x + at$ we see that (3-2) may be
written
$u(x,t) = f_1(x - at) + f_1(x + at) + c$.  (3-5)
This step is sometimes puzzling when encountered for the first time; namely from
$f_2(x) = f_1(x) + c$
how can we deduce $f_2(x + at) = f_1(x + at) + c$? The conclusion follows because the
first equation holds for all values of $x$ (and the conclusion would not follow otherwise).
One cannot simply set $x = x + at$, because that would lead to $at = 0$. But one can
reason as follows: We have $f_2(x) = f_1(x) + c$ for all $x$. Hence $f_2(s) = f_1(s) + c$ for all
$s$, and the choice $s = x + at$ yields the desired result.
So far we have used only the second initial condition (3-3). To ensure
the first condition, $u(x,0) = f(x)$, we set $t = 0$ in (3-5) and equate the result
to $f(x)$, thus:
$f_1(x) + f_1(x) + c = f(x)$.
It follows that $f_1(x) = \tfrac{1}{2} f(x) - \tfrac{1}{2} c$, and substituting into (3-5) gives
the final answer:
$u(x,t) = \tfrac{1}{2} f(x - at) + \tfrac{1}{2} f(x + at)$.  (3-6)
The displacement $u(x,t)$ in (3-6) is the sum of two waves, each of the
form $\tfrac{1}{2} f(x)$, which travel in opposite directions with the velocity $a$.
Initially (that is, for $t = 0$) these waves coincide, but with the passage of
time they diverge, the wave $f(x - at)$ moving to the right and the other
to the left. In particular, if the waves are of finite extent, then any given
point of the string is at rest in the initial position after the passage of both
waves. The situation is illustrated schematically in Fig. 4 when $f(x)$ is a
triangular wave on $(-k,k)$.
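Formula (3-6) is easy to evaluate directly. The sketch below uses a triangular pulse of unit height on $(-k,k)$ as the initial shape; the pulse height, the default values k = 1 and a = 1, and the function names are assumptions made for the illustration.

```python
def triangle(x, k=1.0):
    """Triangular pulse of unit height supported on (-k, k)."""
    return max(0.0, 1.0 - abs(x) / k)

def plucked(x, t, f=triangle, a=1.0):
    """Formula (3-6): one-half the sum of two oppositely moving copies of f."""
    return 0.5 * f(x - a * t) + 0.5 * f(x + a * t)

# At t = 0 the two half-waves coincide and reproduce the initial shape.
print(plucked(0.3, 0.0), triangle(0.3))
# Much later the point x = 0 is at rest again, both waves having passed.
print(plucked(0.0, 5.0))
# The right-moving half-wave, of height 1/2, is then centered at x = a*t.
print(plucked(5.0, 5.0))
```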
Case II. Initial Displacement 0. Suppose, next, that the initial displacement
is 0 but that the initial velocity is not 0. (Such a situation
arises when the string is struck, as in a piano.) If the initial velocity is
$g(x)$ at point $x$ of the string, the initial conditions are now
$u(x,0) = 0, \quad u_t(x,0) = g(x)$.  (3-7)
The first Eq. (3-7) gives
$f_1(x) + f_2(x) = 0$,
when we recall (3-2), so that $f_2(x) = -f_1(x)$ for all values of $x$. Using
this equality with $x$ replaced by $x + at$, we see that (3-2) may be written
$u(x,t) = f_1(x - at) - f_1(x + at)$.  (3-8)
Differentiating (3-8) with respect to $t$ and setting $t = 0$ yield
$u_t(x,0) = -a f_1'(x) - a f_1'(x) = g(x)$
when we use the second condition (3-7). It follows that
$f_1(x) = -\frac{1}{2a}\int_0^x g(s)\,ds + c$,  (3-9)
where $c$ is constant, and hence (3-8) gives the final answer
$u(x,t) = -\frac{1}{2a}\int_0^{x-at} g(s)\,ds + \frac{1}{2a}\int_0^{x+at} g(s)\,ds$.
The result may be expressed more compactly as
$u(x,t) = \frac{1}{2a}\int_{x-at}^{x+at} g(s)\,ds$.  (3-10)
Equation (3-8), like (3-6), represents a superposition of two waves
traveling in opposite directions. Here, however, the shapes of the waves
are determined by $f_1(x)$ and $-f_1(x)$, which are mirror images of each
other in the x axis. Moreover, the shapes are not found directly by the
initial condition but are obtained through the integration (3-9). For
this reason the waves may be of infinite extent even when the initial
impulse $u_t(x,0) = g(x)$ is confined to a finite portion $-k < x < k$ of the
string. Indeed, for such a choice of $g(x)$ formula (3-10) shows that any
given point $x$ of the string eventually suffers a permanent displacement
$\frac{1}{2a}\int_{-k}^{k} g(s)\,ds$.  (3-11)
This is the case because when $at > k + |x|$, the interval $(x - at, x + at)$
contains the interval $(-k,k)$. Inasmuch as $g(x) = 0$ outside the
interval $(-k,k)$, the integral (3-10) is then equal to (3-11). Since each
given point of the string eventually moves the same distance (3-11), the
part of the string that is again at rest forms a straight line parallel to the
original string. It is most interesting that this happens regardless of the
choice of $g(x)$, provided only that $g(x) = 0$ outside some finite interval.
Graphical illustration is given in Fig. 5 for the case $g(x) = 1$ on $(-k,k)$.
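For the case $g(x) = 1$ on $(-k,k)$ the integral (3-10) reduces to the length of an overlap of two intervals, so the permanent displacement (3-11) can be observed without numerical integration. The following is a minimal sketch under that assumption; the function names and default values are hypothetical.

```python
def struck(x, t, k=1.0, a=1.0):
    """Formula (3-10) for g(s) = 1 on (-k, k) and g(s) = 0 elsewhere.

    The integral of g over (x - a*t, x + a*t) equals the length of the
    overlap of that interval with (-k, k)."""
    lo = max(x - a * t, -k)
    hi = min(x + a * t, k)
    return max(0.0, hi - lo) / (2.0 * a)

def permanent_displacement(k=1.0, a=1.0):
    """Value (3-11) for the same g: (1/2a) times the integral of 1 over (-k, k)."""
    return 2.0 * k / (2.0 * a)

# Before the disturbance arrives, the point x = 3 has not moved.
print(struck(3.0, 1.0))
# Once a*t > k + |x| the displacement has reached its permanent value.
print(struck(3.0, 10.0), permanent_displacement())
```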
Fig. 5
Case III. Arbitrary Initial Conditions. Suppose, now, that both the
initial displacement and the initial velocity are given by arbitrary functions
of $x$, so that the initial conditions are
$u(x,0) = f(x), \quad u_t(x,0) = g(x)$.  (3-12)
This problem can be solved by superposition of the two solutions previously
obtained. Indeed, let $v(x,t)$ and $w(x,t)$ satisfy the wave equation (3-1)
and the respective initial conditions
$v(x,0) = f(x), \quad v_t(x,0) = 0$,
$w(x,0) = 0, \quad w_t(x,0) = g(x)$.  (3-13)
Then the function
$u(x,t) = v(x,t) + w(x,t)$  (3-14)
satisfies the wave equation because $v$ and $w$ do, and addition of the relations
(3-13) shows that $u$ satisfies (3-12). Since the wave equation was solved
in the previous discussion subject to initial conditions of the type (3-13),
addition of the two solutions obtained formerly gives the solution desired
now. That is,
$u(x,t) = \tfrac{1}{2} f(x - at) + \tfrac{1}{2} f(x + at) + \frac{1}{2a}\int_{x-at}^{x+at} g(s)\,ds$.  (3-15)
The expression (3-15) is known as d'Alembert's formula; it satisfies (3-1)
and (3-12), hence gives the motion of a string subjected to arbitrary initial
displacements and velocities.
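D'Alembert's formula (3-15) can be evaluated numerically for any reasonable f and g. The sketch below replaces the integral by a composite Simpson rule, which is a choice made here for illustration and not part of the text; the function names and the test data are likewise assumptions.

```python
import math

def simpson(g, lo, hi, n=200):
    """Composite Simpson approximation to the integral of g over (lo, hi);
    n must be even."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(lo + i * h)
    return total * h / 3.0

def dalembert(x, t, f, g, a=1.0):
    """Formula (3-15) with the integral of g evaluated by Simpson's rule."""
    waves = 0.5 * f(x - a * t) + 0.5 * f(x + a * t)
    impulse = simpson(g, x - a * t, x + a * t) / (2.0 * a)
    return waves + impulse

# Check on a case with a known answer: the initial data f = cos x,
# g = -a sin x belong to the left-moving wave u = cos(x + at).
a = 2.0
u = dalembert(0.7, 0.3, math.cos, lambda s: -a * math.sin(s), a)
print(u, math.cos(0.7 + a * 0.3))
```

Setting g to zero recovers (3-6), and setting f to zero recovers (3-10).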
The formula (3-15) can also be used to find the displacement of a
semi-infinite string $(0 < x < \infty)$ fixed at $x = 0$. If the initial displacement and
velocity of a semi-infinite string are
$u(x,0) = f(x), \quad u_t(x,0) = g(x), \quad x > 0$,  (3-16)
we can imagine an infinite string for which the initial conditions in the
interval $(0,\infty)$ coincide with (3-16) and in the interval $(-\infty,0)$ are
determined by
$u(x,0) = -f(|x|), \quad u_t(x,0) = -g(|x|), \quad x < 0$.  (3-17)
The point $x = 0$ of an infinite string, moving in accord with (3-16) and
(3-17), will obviously be at rest, and the behavior of the infinite string for
$x > 0$ will be identical with that of the semi-infinite string.
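The odd extension (3-17) can be sketched in a few lines. To keep the example short it treats only a string released from rest (g = 0), which is an assumption of this sketch rather than a restriction of the method; the initial shape $x e^{-x}$ and the function names are also chosen only for illustration.

```python
import math

def odd_extension(h):
    """Extend h from x > 0 to all x as in (3-17): h(x) = -h(|x|) for x < 0."""
    def extended(x):
        if x > 0:
            return h(x)
        if x < 0:
            return -h(-x)
        return 0.0
    return extended

def semi_infinite_from_rest(x, t, f, a=1.0):
    """Displacement of a semi-infinite string fixed at x = 0 and released
    from rest with shape f: formula (3-15) with g = 0, applied to the odd
    extension of f."""
    F = odd_extension(f)
    return 0.5 * F(x - a * t) + 0.5 * F(x + a * t)

shape = lambda x: x * math.exp(-x)   # hypothetical initial shape with shape(0) = 0

# The end x = 0 stays at rest at all times.
print([semi_infinite_from_rest(0.0, t, shape) for t in (0.0, 0.5, 2.0)])
# For a*t > x the right-moving part comes from the reflected (negative) image.
print(semi_infinite_from_rest(1.0, 3.0, shape), -0.5 * shape(2.0) + 0.5 * shape(4.0))
```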
The superposition method does yield a solution of the problem but does not establish
the uniqueness of that solution. We shall now show that every solution of (3-1) and
(3-12) can be represented in the form $v + w$, with $v$ and $w$ as in (3-13). Since $v$ and $w$
were already shown to be unique, it will follow that $u$ is also unique.
Indeed, let $u(x,t)$ be a solution of the wave equation (3-1) which satisfies the initial
conditions (3-12). Let $v(x,t)$ be the unique solution of (3-1) satisfying the first
conditions (3-13). Then the function $w(x,t)$ defined by
$w(x,t) = u(x,t) - v(x,t)$
satisfies (3-1) and the second set of initial conditions (3-13), as the reader can verify.
It follows that $w$ is uniquely determined and hence $u(x,t)$ is also uniquely determined.
Because of uniqueness, (3-15) describes the behavior of the string. Without uniqueness,
we could only say that (3-15) describes a possible behavior of the string.
PROBLEMS
1. The displacement of a string is given by the traveling wave
$u(x,t) = \sin (x - at)$.
What are the initial displacement and velocity? Verify, by actual substitution into
(3-15), that your initial values yield the correct result, $u(x,t) = \sin (x - at)$.
2. For a freely vibrating string the initial displacement and velocity are, respectively,
sin x and cos 2x. Find the displacement and velocity of the point x ** 0 when l ■*
Hint: First find $u(x,t)$ from (3-15).
3. A freely vibrating string was subjected to an initial displacement $6 \cos 5x$ and
initial velocity 0. One second later it is found that the point $x = 0$ is displaced three
units from the equilibrium position; that is, $u(0,1) = 3$. What can you say about the
velocity of propagation for waves on this string?
4. The initial velocity of a freely vibrating string is $x e^{-x^2}$. For what choice of the
initial displacement (if any) does the resulting motion represent a traveling wave traveling
in the positive x direction? Hint: It is desired that $u(x,t) = f_1(x - at)$. Determine
$f_1$ from the initial velocity, and then determine the initial displacement from $f_1$.
5. Solve Prob. 4 with the words “velocity” and “displacement” interchanged.
6. A stretched infinite string is struck so that its segment $-1 < x < 1$ is given an
initial velocity 1. Use (3-15) to find the displacement and sketch the displacement
curves for $t = 1/a$ and $t = 2/a$.
7. The initial displacement and velocity of a semi-infinite string are $u(x,0) = \sin x$,
$u_t(x,0) = 0$, $0 < x < \infty$. Find $u(x,t)$ for $t > 0$. Also find $u(x,t)$ if $u(x,0) = 0$,
$u_t(x,0) = -2a \cos x$, $0 < x < \infty$.
4. Characteristics. A physical interpretation may be given not only by
plotting $u(x,t)$ versus $x$ for a succession of values of $t$, but also by considering
the xt plane. Each point of the xt plane represents a definite position
on the string at a definite time $t$. If we take $t = 0$ to be the present time,
then the half planes $t < 0$ and $t > 0$ give the past¹ and future, respectively.
Since the speed of propagation is $a$, the disturbance at $(x,t)$ will reach
a point $(x_0,t_0)$ given by
1 Although it is not appropriate to permit $t < 0$ when the string is plucked or struck
at $t = 0$, it is appropriate if the string has been in motion for some time and the initial
conditions are determined by high-speed photography. We could then take the viewpoint
that we are trying to ascertain the past history of the string by observations on
the present.
$\frac{x - x_0}{t - t_0} = a \quad \text{or} \quad \frac{x - x_0}{t - t_0} = -a$,  (4-1)
for the direct wave $f_1(x - at)$ and the opposite wave $f_2(x + at)$, respectively.
Equations (4-1) may be written
$x - at = x_0 - at_0, \quad x + at = x_0 + at_0$.  (4-2)
If we draw the two lines (4-2) through the point $(x_0,t_0)$, as shown in Fig. 6,
their intersection with the x axis (that is, $t = 0$) gives those points on
the string for which the initial condition contributes to the disturbance
at $(x_0,t_0)$. The lines (4-2) are called the characteristics of the partial
differential equation (3-1).
Along the first line (4-2), $x - at$ is constant, and hence $f_1(x - at)$ is
constant. Thus, the deflection due
to the direct wave is the same at all points of the first characteristic (4-2).
The second line serves the same purpose for the opposite wave, and we
can say, briefly, that the disturbance travels along the characteristics.
If the initial disturbance is confined to some interval $(x_1,x_2)$, then we
have the situation shown in Fig. 7. The xt plane is divided by the characteristics
into six regions. In region I the points receive the disturbance
from both waves, in II only from the opposite wave, and in III only from
the direct wave. The points in IV and V are too far away to receive any
disturbance at the corresponding times, and the points in VI are at rest
because both waves have passed. That is, if $P$ is a point in the region
VI, then the characteristics through $P$ (shown dashed in the figure) intersect
the x axis outside the interval $(x_1,x_2)$. Hence the initial displacement
at these points is zero, and we need consider the initial impulse only.
Since the characteristics intersect the x axis outside the interval $(x_1,x_2)$, the
displacement at $P$ due to the initial impulse is given by the constant value
(3-11).
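The role of the characteristics can be made concrete by computing where the two lines (4-2) through a point meet the x axis. The sketch below considers only an initial displacement confined to $(x_1,x_2)$; the function names and the sample interval are assumptions for the illustration.

```python
def characteristic_feet(x0, t0, a=1.0):
    """Points where the two characteristics (4-2) through (x0, t0) meet t = 0."""
    return x0 - a * t0, x0 + a * t0

def waves_present(x0, t0, x1, x2, a=1.0):
    """For an initial displacement confined to (x1, x2), report whether the
    direct wave f1(x - at) and the opposite wave f2(x + at) are felt at
    the point (x0, t0), t0 >= 0."""
    left_foot, right_foot = characteristic_feet(x0, t0, a)
    direct = x1 < left_foot < x2      # f1 is constant along x - at = x0 - a*t0
    opposite = x1 < right_foot < x2   # f2 is constant along x + at = x0 + a*t0
    return direct, opposite

x1, x2 = -1.0, 1.0
print(waves_present(0.0, 0.1, x1, x2))   # both waves, as in region I
print(waves_present(3.0, 2.5, x1, x2))   # direct wave only
print(waves_present(-3.0, 2.5, x1, x2))  # opposite wave only
print(waves_present(0.0, 5.0, x1, x2))   # neither: both waves have passed
print(waves_present(9.0, 1.0, x1, x2))   # neither: too far away
```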
We have seen that the initial conditions determine both the direct wave
and the opposite wave at each point on the x axis where these conditions
are given. Since the disturbance propagates along the characteristics
the following theorem is suggested:
Theorem I. Let $u$ and $u_t$ be given on the interval $(x_1,x_2)$ in Fig. 8, and
suppose $u_{tt} = a^2 u_{xx}$. Then $u(x,t)$ is
uniquely determined in the shaded
region but is not uniquely determined
at any other point.
Both the initial displacement and the initial velocity have to be specified
in Theorem I, just as one would expect intuitively. It is a remarkable
fact that the displacement alone (without the velocity) will determine
the solution, provided this displacement is given along two intersecting
characteristics in the xt plane.
Indeed, let $u(x,t)$ be given along $(x_1,P)$ in Fig. 8. Since the direct wave
$f_1(x - at)$ is constant on $(x_1,P)$, we can ascertain the shape of the reverse
wave $f_2(x + at)$ along $(x_1,P)$. This, in turn, gives $f_2(x + at)$ along $(x_1,x_2)$,
because the disturbance $f_2(x + at)$ propagates from $(x_1,x_2)$ to $(x_1,P)$ along
the characteristics parallel to $(x_2,P)$ (see the dashed line in the figure).
In just the same way, when $u(x,t)$ is given on $(x_2,P)$, we can determine
the shape of the direct wave $f_1(x - at)$ on $(x_1,x_2)$. Thus, we are led to the
following theorem:
Theorem II. Let $u$ be specified along the two intersecting characteristics
$(x_1,P)$ and $(x_2,P)$ in Fig. 8, and suppose that $u_{tt} = a^2 u_{xx}$. Then $u(x,t)$ is
uniquely determined in the shaded region but is not uniquely determined at
any other point.
Theorems I and II are the fundamental existence and uniqueness theorems
for the wave equation, deduced here by physical considerations. A simple
mathematical proof of the same results is given in Sec. 25.
Fig. 8
5. Boundary Conditions. We now suppose that the freely vibrating
string is not infinite but is stretched between two points of support (Fig.
3). When the supports are on the x axis and do not move, the situation
is described by
$u(0,t) = 0, \quad u(l,t) = 0$ for all $t$.  (5-1)
These are called boundary conditions, because they refer to the boundary
points of the interval $(0,l)$ in which our physical problem is defined.
Although they do not determine the motion
uniquely, they do enable us to establish some of the most interesting and
important properties of the motion. Hence, in this section we see what
can be deduced from (5-1) alone. In the next section we use (5-1) together
with appropriate initial conditions.
Physically, one would expect the string to act like an infinite string
until the disturbance created by the ends reaches the point of observation.
In terms of Fig. 7, the ends $x = 0$ and $x = l$ have no effect in
the region I provided the points $x = 0$ and $x = l$ lie outside the
interval $(x_1,x_2)$. When the disturbance reaches an end point, however,
it is reflected, and the reflected wave must eventually be taken into
account.
Because the end point is fixed the incident and reflected waves have
algebraic sum 0 at the end point, and hence there is a 180° phase shift.
A wave of type $f_2(x + at)$ becomes a wave of type $-f_2(-x + at)$ upon
reflection at $x = 0$, for example (see Fig. 9). The change of sign in $f_2$
expresses the phase shift, and the change of sign in $x$ indicates that
the reflected wave
$g(x - at) \equiv -f_2(-x + at)$
propagates in the opposite direction.
Fig. 9
When the wave is reflected again at $x = l$, we get another minus sign
in each case, and hence the original wave $f_2(x + at)$ is restored (Fig. 9).
Since the velocity is $a$ and the length of the round-trip path is $2l$, the time
for a round trip is
Period of vibration = $2l/a$.  (5-2)
In terms of $f_2(at)$ the periodicity condition means that
$f_2(at) = f_2[a(t + 2l/a)] = f_2(at + 2l)$.
Similar remarks apply to $f_1(x)$, and hence we expect that both $f_1(x)$ and
$f_2(x)$ will be periodic functions¹ with period $2l$.
1 A function $f(x)$ has period $p$ if $f(x + p) \equiv f(x)$, where $p$ is a nonzero constant.
To discuss the boundary conditions mathematically, let us think of
the finite string as being in reality an infinite string which vibrates in such
a way that the points $x = 0$ and $x = l$ remain fixed. The formula
$u(x,t) = f_1(x - at) + f_2(x + at)$  (5-3)
holds for all solutions of the wave equation. Letting $x = 0$ gives
$0 = f_1(-at) + f_2(at)$
when we use the first boundary condition $u(0,t) = 0$. This shows that
$f_2(s) = -f_1(-s)$ for all $s$, and hence (5-3) becomes
$u(x,t) = f_1(x - at) - f_1(-x - at)$.  (5-4)
Thus the effect of the boundary condition at $x = 0$ is to reduce the number
of arbitrary functions from two to one.
The second boundary condition applied to (5-4) gives
$0 = f_1(l - at) - f_1(-l - at)$
or, if we set $s = -l - at$,
$0 = f_1(s + 2l) - f_1(s)$.  (5-5)
Since $t$ is arbitrary, so is $s$, and hence $f_1(x)$ has period $2l$. (This agrees
with the surmise we had formed on physical grounds.) In view of (5-4),
we can summarize our result as follows:
Theorem I. Suppose an infinite string vibrates freely in such a way that
the points $x = 0$ and $x = l$ remain fixed. Then the displacement $u(x,t)$ is
periodic both in space and in time. The two periods are, respectively, $2l$
for $x$ and $2l/a$ for $t$ if $a$ is the velocity of propagation.
Hence if a string is stretched between two fixed points, the free vibrations
are periodic no matter what the initial conditions may be. Since a periodic
vibration is generally perceived as musical, this fact is of great importance
for the development of musical instruments.
Theorem I asserts that the motion will repeat after a time $2l/a$. Hence
if the minimum $t$ period of a vibrating string is determined by observation,
that minimum period will not be longer than $2l/a$. It may be shorter,
however. For instance, the function
$u(x,t) = \sin (2\pi x/l) \cos (2\pi at/l)$
satisfies (5-1) and the wave equation, hence represents free vibrations of a
string of length $l$. But the minimum $t$ period of this function is $l/a$ rather
than $2l/a$. The shorter period is explained by the fact that $u(l/2,t) \equiv 0$;
that is, the center of the string is a node. The center does not move,
and the string acts like two strings of length $l/2$ placed end to end. We
shall now show that there is always at least one node if the period is smaller
than that given by Theorem I.
Theorem II. If the string considered in Theorem I has an $x$ period $2p$
or $t$ period $2p/a$, where $0 < p < l$, then the point $x = p$ must be a node.
Suppose, first, that the $x$ period is $2p$. Then, in particular,
$u(p,t) = u(-p,t)$.  (5-6)
On the other hand (5-4) gives $u(x,t) = -u(-x,t)$; hence
$u(p,t) = -u(-p,t)$.  (5-7)
By addition of (5-6) and (5-7) we get $u(p,t) = 0$, which shows that $x = p$ is a node.
Suppose, next, that the $t$ period is $2p/a$. The equation
$u(x, t + 2p/a) = u(x,t)$
combines with (5-4) to give
$f_1(x - at - 2p) - f_1(-x - at - 2p) = f_1(x - at) - f_1(-x - at)$.
If we let $x + at = 0$ and $x - at - 2p = s$, the equation reduces, after rearrangement,
to
$f_1(s + 2p) - f_1(s) = c$,  (5-8)
where $c = f_1(0) - f_1(-2p)$ is constant. Equation (5-8) shows that $f_1(s)$ increases by
the amount $c$ whenever $s$ increases by $2p$. If $c \neq 0$, it follows that $|f_1(s)|$ is unbounded.
However, $f_1(s)$ has period $2l$ by (5-5), hence is bounded, and this shows that $c = 0$ in
(5-8). The choice $s = -p - at$ in (5-8) with $c = 0$ leads to the desired result:
$u(p,t) = f_1(p - at) - f_1(-p - at) = 0$.
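The conclusions of Theorems I and II can be checked numerically on (5-4). In the sketch below the particular $2l$-periodic function `f1` is a hypothetical choice, and the second function is the example $\sin (2\pi x/l) \cos (2\pi at/l)$ discussed above; all names and parameter values are assumptions for the illustration.

```python
import math

L_STRING = 1.0   # assumed length l of the string
A_SPEED = 2.0    # assumed velocity of propagation a

def f1(s, l=L_STRING):
    """A hypothetical function with period 2l, as required by (5-5)."""
    return math.sin(math.pi * s / l) + 0.3 * math.cos(3.0 * math.pi * s / l)

def u_from_f1(x, t, f=f1, a=A_SPEED):
    """Formula (5-4): u(x,t) = f1(x - at) - f1(-x - at)."""
    return f(x - a * t) - f(-x - a * t)

def example(x, t, l=L_STRING, a=A_SPEED):
    """The example u = sin(2 pi x / l) cos(2 pi a t / l) from the text."""
    return math.sin(2.0 * math.pi * x / l) * math.cos(2.0 * math.pi * a * t / l)

t = 0.37
# Theorem I: both ends are fixed and the motion repeats after 2l/a.
print(u_from_f1(0.0, t), u_from_f1(L_STRING, t))
print(u_from_f1(0.3, t), u_from_f1(0.3, t + 2.0 * L_STRING / A_SPEED))
# Theorem II with p = l/2: the example has t period l/a and a node at x = l/2.
print(example(0.3, t), example(0.3, t + L_STRING / A_SPEED))
print(example(L_STRING / 2.0, t))
```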
To illustrate the use of Theorem II, suppose a 2-in.-diameter steel cable 100 ft long is observed to vibrate without nodes at the rate of two complete cycles per second. According to Theorem I the $t$ period is $2l/a$ or possibly less. But Theorem II shows that the period is not less, since the motion was observed to have no fixed points. Hence
$$\frac{1}{2} = 2\,\frac{100}{a},$$
which gives $a = 400$ fps. This is the velocity with which waves are propagated along the cable. Since the density of steel is about 480 lb per ft$^3$, the weight of 1 ft of cable is
$$480\pi\left(\tfrac{1}{12}\right)^2 \cdot 1 = \tfrac{10}{3}\pi \approx 10 \text{ lb per ft}.$$
This gives $\rho = 10/32$ slug per ft for the linear density, and hence the tension is
$$T = a^2\rho = (400)^2\left(\tfrac{10}{32}\right) = 50{,}000 \text{ lb}.$$
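The arithmetic of this example can be retraced numerically. The following is only a sketch: the function name `cable_tension` and the value g = 32 ft per sec$^2$ are assumptions made here for illustration. With the unrounded weight $\tfrac{10}{3}\pi$ lb per ft the tension comes out near 52,400 lb, which the text rounds to 50,000 lb.

```python
import math

def cable_tension(length_ft, cycles_per_sec, diameter_in, density_lb_ft3, g=32.0):
    """Wave speed, weight per foot, and tension for a cable vibrating without nodes.

    With no nodes the t period is 2l/a, so a = 2 * l * frequency.
    """
    a = 2.0 * length_ft * cycles_per_sec            # wave speed, ft per sec
    radius_ft = (diameter_in / 2.0) / 12.0          # radius converted to feet
    weight_per_ft = density_lb_ft3 * math.pi * radius_ft ** 2
    rho = weight_per_ft / g                         # linear density, slug per ft
    return a, weight_per_ft, rho * a ** 2           # tension T = a^2 * rho

a, weight, tension = cable_tension(100.0, 2.0, 2.0, 480.0)
print(a, round(weight, 2), round(tension))
```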
PROBLEMS
1. An infinite string vibrates freely in such a way that the two points $x = 0$ and $x = l$ remain fixed; that is, $u(0,t) = u(l,t) = 0$. Are any other points of the string necessarily fixed? Which ones?
2. Suppose a freely vibrating string of length $l$ has just one node at $x = p$ between 0 and $l$. Show that the node must be at the mid-point. (The analogous result for $n$ nodes is also true.) Hint: If $p < l/2$, apply Prob. 1 to the two points $x = 0$, $x = p$. If $p > l/2$, apply Prob. 1 to the two points $x = p$, $x = l$.
3. A cable of length $l$ ft is made of a material with density $d$ lb per ft$^3$. It is found that the cable makes 10 complete oscillations in $t$ sec. Show that the cross-sectional stress is
$$\sigma = 0.087\,\frac{d\,l^2}{t^2} \text{ psi}$$
provided the oscillations do not have a node between 0 and $l$. How would the result change if the mid-point remains fixed during the observed oscillations but no other point remains fixed?
4. Let $h(t)$ be a given function of $t$. (a) What is the physical meaning of the boundary condition $u(l,t) = h(t)$? (b) Describe a physical problem that would lead to the boundary conditions $u(x,0) = 0$, $u[x,h(t)] = 0$.
6. Initial and Boundary Conditions. We shall now consider the free vibrations of a string satisfying the boundary conditions
$$u(0,t) = 0, \quad u(l,t) = 0 \quad \text{for all } t, \tag{6-1}$$
together with the initial conditions
$$u(x,0) = f(x), \quad u_t(x,0) = g(x) \quad \text{for } 0 < x < l. \tag{6-2}$$
As in the preceding section we regard the finite string as being an infinite string with nodes $x = 0$, $x = l$. According to (5-4) and (5-5), the boundary conditions give
$$u(x,t) = f_1(x - at) - f_1(-x - at) \tag{6-3}$$
where $f_1(x)$ has period $2l$, and conversely, (6-3) ensures (6-1). The initial conditions (6-2) are prescribed on $(0,l)$ for the infinite string, and our task is to assign initial conditions outside the interval $(0,l)$ in such a way that the solution has the form (6-3).
Denoting the unknown initial conditions for the infinite string by $f_0(x)$ and $g_0(x)$, we have
$$f_0(x) = f(x), \quad g_0(x) = g(x), \quad 0 < x < l, \tag{6-4}$$
because the infinite string is to agree with the finite string on $(0,l)$. Upon setting $t = 0$ in (6-3) we get
$$f_0(x) = f_1(x) - f_1(-x).$$
Similarly, differentiating (6-3) with respect to $t$ and putting $t = 0$ give
$$g_0(x) = -a f_1'(x) + a f_1'(-x).$$
These expressions show that [1]
$$f_0(x) \text{ and } g_0(x) \text{ are odd functions.} \tag{6-5}$$
Hence, $f_0$ and $g_0$ are determined on $(-l,l)$ by their values on $(0,l)$. Finally, since $f_0(x)$ and $g_0(x)$ are expressed in terms of the function $f_1(x)$, which has period $2l$, we see that
$$f_0(x) \text{ and } g_0(x) \text{ have period } 2l. \tag{6-6}$$
Thus, $f_0$ and $g_0$ are known everywhere as soon as they are known on $(-l,l)$.
[1] A function $\phi(x)$ is even if $\phi(-x) = \phi(x)$, odd if $\phi(-x) = -\phi(x)$. An analytical and graphical discussion of such functions is given in Chap. 2, Sec. 19.
According to (3-15), the solution is
$$u(x,t) = \frac{1}{2} f_0(x - at) + \frac{1}{2} f_0(x + at) + \frac{1}{2a}\int_{x-at}^{x+at} g_0(s)\,ds. \tag{6-7}$$
If $f_0(x)$ and $g_0(x)$ in (6-7) are determined by (6-4) to (6-6), it is easily verified that this function $u(x,t)$ satisfies the wave equation, the initial conditions (6-2), and the boundary conditions (6-1). Thus, (6-7) is a simple and explicit expression for the motion of a vibrating string with fixed end points.
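Formula (6-7) is easy to evaluate by machine once the odd, $2l$-periodic extensions of (6-5) and (6-6) are coded. The sketch below is one possible arrangement, not the authors' procedure: the names `odd_periodic` and `string_displacement` are invented here, and the integral of $g_0$ is approximated by a midpoint rule whose step count is an arbitrary choice.

```python
import math

def odd_periodic(func, l):
    """Odd extension of func from (0, l) to (-l, l), repeated with period 2l."""
    def extended(x):
        y = math.fmod(x, 2.0 * l)          # y lies in (-2l, 2l)
        if y > l:
            y -= 2.0 * l
        elif y < -l:
            y += 2.0 * l                   # now y lies in [-l, l]
        return func(y) if y >= 0.0 else -func(-y)
    return extended

def string_displacement(f, g, l, a, x, t, steps=2000):
    """Evaluate (6-7) for initial displacement f and initial velocity g on (0, l)."""
    f0, g0 = odd_periodic(f, l), odd_periodic(g, l)
    lo, hi = x - a * t, x + a * t
    h = (hi - lo) / steps
    integral = h * sum(g0(lo + (k + 0.5) * h) for k in range(steps))
    return 0.5 * f0(lo) + 0.5 * f0(hi) + integral / (2.0 * a)

# Single-mode checks with l = 1, a = 2 (values chosen only for illustration).
l, a, x, t = 1.0, 2.0, 0.3, 0.7
mode = lambda s: math.sin(math.pi * s / l)
zero = lambda s: 0.0
u_disp = string_displacement(mode, zero, l, a, x, t)
u_vel = string_displacement(zero, mode, l, a, x, t)
```

For a single sine mode the two results can be compared with the closed forms $\sin(\pi x/l)\cos(\pi a t/l)$ and $(l/\pi a)\sin(\pi x/l)\sin(\pi a t/l)$.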
The correspondence between the finite string and the infinite string leads to an interesting geometrical construction for getting the disturbance at any point $P$ of the strip $0 < x < l$ in the $xt$ plane (Fig. 10). For the infinite string the disturbance at $P$ is found by drawing characteristics as in Sec. 4 (see solid lines in Fig. 10). Since the initial conditions for the infinite string are obtained from those for the finite string by (6-4) to (6-6), the same result may be found by following the dashed lines in Fig. 10. To take account of (6-5), however, we must introduce a changed sign upon each reflection at the boundary. The disturbance at $P$ arises from the initial disturbance at $x_1$ and $x_2$, subject to the above-mentioned convention regarding sign. This reflection of the characteristics in the boundary lines $x = 0$, $x = l$ is quite analogous to the reflection of waves at the end points of the string.
The procedure illustrated in Fig. 10 is an example of the method of images, so called because the initial conditions for the infinite string are obtained from those for the finite string by forming repeated mirror images in the lines $t = 0$, $x = 0$, and $x = l$.
Example: Discuss the free oscillations of a string of length $l$ which satisfies the initial conditions
$$u(x,0) = f_n \sin\frac{n\pi x}{l}, \quad u_t(x,0) = 0,$$
where $n$ is an integer and $f_n$ is constant.
Since $\sin n\pi x/l$ is odd and has period $2l$, we may take
$$f_0(x) = f_n \sin\frac{n\pi x}{l}, \quad g_0(x) = 0$$
as initial conditions for the associated infinite string. Equation (6-7) now yields the solution
$$u(x,t) = \frac{f_n}{2}\sin\frac{n\pi(x - at)}{l} + \frac{f_n}{2}\sin\frac{n\pi(x + at)}{l} = f_n \sin\frac{n\pi x}{l}\cos\frac{n\pi a t}{l}. \tag{6-8}$$
If the initial displacement is given by a Fourier series, so that [1]
$$u(x,0) = f(x) = \sum f_n \sin\frac{n\pi x}{l}, \quad u_t(x,0) = 0,$$
then superposition of the corresponding solutions (6-8) yields
$$u(x,t) = \sum f_n \sin\frac{n\pi x}{l}\cos\frac{n\pi a t}{l}. \tag{6-9}$$
Similarly, by choosing $f_0(x) = 0$, $g_0(x) = g_n \sin n\pi x/l$ in (6-7), the reader can verify that the solution satisfying
$$u(x,0) = 0, \quad u_t(x,0) = g(x) = \sum g_n \sin\frac{n\pi x}{l}$$
is
$$u(x,t) = \sum \frac{l\,g_n}{n\pi a}\sin\frac{n\pi x}{l}\sin\frac{n\pi a t}{l}. \tag{6-10}$$
Superposition of (6-9) and (6-10) yields the general Fourier-series solution of the wave equation satisfying (6-1) and (6-2). The result can be expressed explicitly in terms of $f(x)$ and $g(x)$ by means of the Euler-Fourier formulas
$$f_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx, \quad g_n = \frac{2}{l}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx.$$
Because of convergence questions the Fourier-series solution is somewhat less general than (6-7), and it is hopelessly inferior to (6-7) for numerical computation. But Fourier series have great usefulness in that they apply to many problems in which the preceding methods fail. Examples are given for the vibrating string in the next section and for other physical systems in the sections to follow.
[1] Throughout Chap. 6 we use $\sum$ as an abbreviation for $\sum_{n=1}^{\infty}$. A brief review of this sigma notation is given in Chap. 2, Sec. 2, and Fourier series are discussed in Chap. 2, Secs. 18 to 25.
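For comparison with the closed form (6-7), the series (6-9) and (6-10) can be summed directly. The sketch below is illustrative only: `sine_coeff` and `fourier_string` are names chosen here, the Euler-Fourier integrals are approximated by a midpoint rule, and the truncation at 50 terms is arbitrary. The triangular test profile is likewise an assumed example; its slow convergence shows why the series is less convenient for computation.

```python
import math

def sine_coeff(func, l, n, steps=4000):
    """Midpoint-rule approximation to (2/l) * integral of func(x) sin(n pi x / l) on (0, l)."""
    h = l / steps
    total = sum(func((k + 0.5) * h) * math.sin(n * math.pi * (k + 0.5) * h / l)
                for k in range(steps))
    return 2.0 / l * h * total

def fourier_string(f, g, l, a, x, t, terms=50):
    """Partial sum of (6-9) plus (6-10)."""
    u = 0.0
    for n in range(1, terms + 1):
        omega = n * math.pi * a / l
        fn, gn = sine_coeff(f, l, n), sine_coeff(g, l, n)
        u += (fn * math.cos(omega * t) + gn / omega * math.sin(omega * t)) \
             * math.sin(n * math.pi * x / l)
    return u

zero = lambda s: 0.0
second_mode = lambda s: math.sin(2.0 * math.pi * s)     # l = 1
triangle = lambda s: 2.0 * s if s < 0.5 else 2.0 * (1.0 - s)

u_mode = fourier_string(second_mode, zero, 1.0, 1.0, 0.25, 0.1)
u_peak = fourier_string(triangle, zero, 1.0, 1.0, 0.5, 0.0)
```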
PROBLEMS
1. Show that the expression (6-8) satisfies the appropriate (a) differential equation, (b) initial conditions, (c) boundary conditions.
2. The initial displacement of a freely vibrating string of length $l$ is
$$f(x) = \frac{2bx}{l}, \quad 0 < x < \frac{l}{2}; \qquad f(x) = \frac{2b(l - x)}{l}, \quad \frac{l}{2} < x < l,$$
and the initial velocity is $g(x) = 0$. (a) Sketch $f(x)$ and $f_0(x)$. (b) Using (6-7) and your sketch, find the displacement of the mid-point of the string when $at = l/4$.
3. (a) Express $f(x)$ in Prob. 2 as a Fourier sine series. (b) By (a) and (6-9) show that the displacement of the string in Prob. 2 is
$$u(x,t) = \frac{8b}{\pi^2}\left(\frac{1}{1^2}\sin\frac{\pi x}{l}\cos\frac{\pi a t}{l} - \frac{1}{3^2}\sin\frac{3\pi x}{l}\cos\frac{3\pi a t}{l} + \cdots\right).$$
(c) Obtain an infinite-series representation for the displacement of the mid-point when $at = l/4$.
7. Damped Oscillations. The foregoing discussion was concerned with free vibrations, so that $F(x,t) \equiv 0$ in (2-3). It was indicated that the displacement $u(x,t)$ is always periodic in time and hence the amplitude remains constant. But, in fact, the oscillations gradually die down when a string is vibrating in air, and this behavior is to be analyzed next.
The reason for the decrease in amplitude is that the air resists the motion of an object moving through it. When there is no relative velocity, there is no resistance; when there is high velocity, there is high resistance. If the resistance is assumed proportional to the velocity, we have
$$F(x,t) = -2b\,u_t(x,t), \qquad b > 0, \text{ const}, \tag{7-1}$$
in (2-3). The minus sign is used because the force resists the motion, hence is directed opposite to the velocity. Our partial differential equation is now
$$u_{tt} - a^2 u_{xx} = -2b\,u_t \tag{7-2}$$
and the solutions of (7-2) for $b > 0$ represent the damped oscillations of the string. As before, one has the initial and boundary conditions
$$u(x,0) = f(x), \quad u_t(x,0) = g(x), \tag{7-3}$$
$$u(0,t) = 0, \quad u(l,t) = 0. \tag{7-4}$$
Equation (7-2) cannot be solved by the method of the preceding sections but can be solved by Fourier series. Thus, since the solution $u(x,t)$ is a twice-differentiable function of $x$ for each $t$, we may expand $u(x,t)$ in a Fourier sine series
$$u(x,t) = \sum b_n(t)\sin\frac{n\pi x}{l}, \quad 0 \le x \le l. \tag{7-5}$$
A sine series is chosen rather than a cosine series because such a series automatically satisfies the boundary conditions (7-4). To satisfy the initial conditions we require
$$f(x) = \sum b_n(0)\sin\frac{n\pi x}{l}, \quad g(x) = \sum b_n'(0)\sin\frac{n\pi x}{l}. \tag{7-6}$$
These relations show that $b_n(0)$ and $b_n'(0)$ must be the Fourier coefficients of $f(x)$ and $g(x)$. That is, if
$$f_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx, \quad g_n = \frac{2}{l}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx \tag{7-7}$$
then multiplying (7-6) by $\sin n\pi x/l$ and integrating from 0 to $l$ yield
$$b_n(0) = f_n, \quad b_n'(0) = g_n. \tag{7-8}$$
We must still satisfy the differential equation. Upon substituting the terms (7-5) into (7-2) we get
$$\sum b_n''\sin\frac{n\pi x}{l} + a^2\sum b_n\left(\frac{n\pi}{l}\right)^2\sin\frac{n\pi x}{l} = -2b\sum b_n'\sin\frac{n\pi x}{l},$$
which gives a set of ordinary differential equations
$$b_n'' + 2b\,b_n' + \left(\frac{n\pi a}{l}\right)^2 b_n = 0 \tag{7-9}$$
when the coefficient of $\sin n\pi x/l$ is equated to zero.
Equation (7-9) may be solved as in Chap. 1 by assuming that $b_n = e^{\alpha t}$. It is found that
$$b_n(t) = C_0 e^{-bt}\cos\omega_n t + C_1 e^{-bt}\sin\omega_n t, \tag{7-10}$$
where
$$\omega_n = \sqrt{\left(\frac{n\pi a}{l}\right)^2 - b^2}. \tag{7-11}$$
The arbitrary constants $C_0$ and $C_1$ are determined from (7-8) as
$$C_0 = f_n, \quad C_1 = (g_n + b f_n)\,\omega_n^{-1}.$$
Substituting (7-10) into (7-5) yields the final answer
$$u(x,t) = e^{-bt}\sum\left[f_n\cos\omega_n t + (g_n + b f_n)\frac{\sin\omega_n t}{\omega_n}\right]\sin\frac{n\pi x}{l}.$$
Conditions (7-2)-(7-4) are satisfied if the term-by-term differentiation is legitimate, for instance, if $f'''(x)$ and $g''(x)$ are bounded. [1] When $b = 0$, the solution agrees with the sum of (6-9) and (6-10), as it should. According to (7-11), the damping reduces the frequency of the corresponding terms in the series for undamped vibrations. If $b < \pi a/l$, all the terms are oscillatory and they have the same damping factor $e^{-bt}$. But for larger values of $b$ the first few terms may have $\omega_n$ pure imaginary. The corresponding trigonometric functions become hyperbolic functions, and the terms in question are not oscillatory. If $\omega_n = 0$, which may happen in this latter case, we replace $(\sin\omega_n t)/\omega_n$ by its limit $t$ [cf. Chap. 1, Sec. 32].
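The three cases just described (oscillatory, hyperbolic, and the limiting case $\omega_n = 0$) can be collected in one routine for the coefficient $b_n(t)$ of (7-10). This is a sketch under the assumption that $\omega_n^2 = (n\pi a/l)^2 - b^2$, as the characteristic equation of (7-9) requires; the function name `damped_mode` and the sample values are chosen here for illustration.

```python
import math

def damped_mode(fn, gn, n, l, a, b, t):
    """b_n(t) from (7-10) with C0 = f_n and C1 = (g_n + b f_n) / omega_n."""
    w2 = (n * math.pi * a / l) ** 2 - b ** 2      # omega_n squared
    if w2 > 0.0:                                   # oscillatory term
        w = math.sqrt(w2)
        c, s = math.cos(w * t), math.sin(w * t) / w
    elif w2 < 0.0:                                 # omega_n imaginary: hyperbolic functions
        k = math.sqrt(-w2)
        c, s = math.cosh(k * t), math.sinh(k * t) / k
    else:                                          # omega_n = 0: sin(w t)/w replaced by t
        c, s = 1.0, t
    return math.exp(-b * t) * (fn * c + (gn + b * fn) * s)

def residual(fn, gn, n, l, a, b, t, h=1e-4):
    """Finite-difference residual of (7-9); it should be close to zero."""
    y0 = damped_mode(fn, gn, n, l, a, b, t - h)
    y1 = damped_mode(fn, gn, n, l, a, b, t)
    y2 = damped_mode(fn, gn, n, l, a, b, t + h)
    return (y2 - 2 * y1 + y0) / h ** 2 + 2 * b * (y2 - y0) / (2 * h) \
        + (n * math.pi * a / l) ** 2 * y1

light = residual(1.0, 0.5, 1, 1.0, 1.0, 0.5, 0.8)   # b < pi a / l
heavy = residual(1.0, 0.5, 1, 1.0, 1.0, 5.0, 0.8)   # b > pi a / l
```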
PROBLEMS
1. A string of length $l$ vibrating in air satisfies the initial conditions $u(x,0) = f_1\sin\pi x/l$, $u_t(x,0) = 0$. Show that the displacement of the mid-point can be written in the form
$$u(l/2,t) = Ae^{-bt}\cos(\omega t + \phi), \quad A, \omega, \phi \text{ const}.$$
2. Referring to Prob. 1, sketch the curves $y = \pm Ae^{-bt}$ and $y = u(l/2,t)$ in a single neat diagram. Thus describe an experimental procedure for determining $b$. (When the oscillations are rapid and $b$ is small, one can speak of the mean amplitude at a given time $t$. If the amplitude is $A_0$ at time $t_0$ and $A_1$ at time $t_0 + \tau$, the reader can verify that
$$b = \frac{1}{\tau}\log\frac{A_0}{A_1}.$$
Since $A_0$ and $A_1$ can be found by placing a scale behind the oscillating string, this gives a method for comparing the viscosity of gases.)
8. Forced Oscillations and Resonance. Sometimes the force function $F(x,t)$ does not involve the unknown displacement $u$, as in (7-1), but is determined independently. (For example, consider the gravitational force on a horizontal vibrating string.) The corresponding mathematical problem is
$$u_{tt} - a^2 u_{xx} = F(x,t), \quad u(x,0) = f(x), \quad u_t(x,0) = g(x), \quad u(0,t) = 0, \quad u(l,t) = 0. \tag{8-1}$$
Associated with this problem are two simpler problems,
$$v_{tt} - a^2 v_{xx} = F(x,t), \quad v(x,0) = 0, \quad v_t(x,0) = 0, \quad v(0,t) = 0, \quad v(l,t) = 0, \tag{8-2}$$
and
$$w_{tt} - a^2 w_{xx} = 0, \quad w(x,0) = f(x), \quad w_t(x,0) = g(x), \quad w(0,t) = 0, \quad w(l,t) = 0. \tag{8-3}$$
[1] See Chap. 2, Sec. 26, Theorem III.
Equation (8-2) describes purely forced vibrations, and (8-3) describes free vibrations. Now, if $v$ satisfies (8-2) and $w$ satisfies (8-3), it is easily seen that $u = v + w$ satisfies (8-1). Also, uniqueness in the latter problem yields uniqueness in the former. Since (8-3) was solved in Sec. 6, we need consider (8-2) only. This system will now be solved formally on the assumption that $F(x,t)$ has a Fourier series,
$$F(x,t) = \sum B_n(t)\sin\frac{n\pi x}{l}. \tag{8-4}$$
The coefficients are given by the Euler-Fourier formulas,
$$B_n(t) = \frac{2}{l}\int_0^l F(\xi,t)\sin\frac{n\pi\xi}{l}\,d\xi. \tag{8-5}$$
Substituting (8-4) and the Fourier series
$$u(x,t) = \sum b_n(t)\sin\frac{n\pi x}{l} \tag{8-6}$$
into the differential equation (8-1) gives
$$\sum b_n''\sin\frac{n\pi x}{l} + \sum\omega_n^2 b_n\sin\frac{n\pi x}{l} = \sum B_n\sin\frac{n\pi x}{l},$$
where $\omega_n = n\pi a/l$ [compare (7-11)]. If we equate the coefficients of $\sin n\pi x/l$, we get
$$b_n'' + \omega_n^2 b_n = B_n. \tag{8-7}$$
These equations are to be solved subject to the initial conditions
$$b_n(0) = 0, \quad b_n'(0) = 0, \tag{8-8}$$
which result from the initial conditions in (8-2). By the method of Chap. 1, Sec. 28 [cf. also Eq. (33-9) in Chap. 1], the solution of (8-7) and (8-8) is
$$b_n(t) = \omega_n^{-1}\int_0^t B_n(\lambda)\sin\omega_n(t - \lambda)\,d\lambda. \tag{8-9}$$
Determining $B_n(\lambda)$ by (8-5), $b_n(t)$ by (8-9), and $u(x,t)$ by (8-6) yields an explicit formula
$$u(x,t) = \sum\frac{2}{n\pi a}\sin\frac{n\pi x}{l}\int_0^t\!\!\int_0^l \sin\frac{n\pi\xi}{l}\,\sin\frac{n\pi a}{l}(t - \lambda)\,F(\xi,\lambda)\,d\xi\,d\lambda$$
when $\omega_n$ is replaced by its value $n\pi a/l$.
If we have both damping and forcing, then (8-7) contains an extra term $2b\,b_n'$ as in (7-9). This leads to a different formula (8-9), but in other respects the analysis is unchanged. Thus, the method of Fourier series enables us to find the damped oscillations of a string with arbitrary initial conditions and force function.
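The integral (8-9) can be checked numerically against a case that is solvable by hand. In the sketch below the name `forced_coefficient`, the midpoint rule, and the test forcing $B_n(t) = \sin\omega t$ are all choices made here for illustration; for that forcing with $\omega \ne \omega_n$, direct solution of (8-7) and (8-8) gives $b_n = [\sin\omega t - (\omega/\omega_n)\sin\omega_n t]/(\omega_n^2 - \omega^2)$.

```python
import math

def forced_coefficient(B, omega_n, t, steps=4000):
    """Midpoint-rule evaluation of (8-9) for a given coefficient function B(lambda)."""
    h = t / steps
    total = sum(B((k + 0.5) * h) * math.sin(omega_n * (t - (k + 0.5) * h))
                for k in range(steps))
    return h * total / omega_n

omega_n = math.pi          # n = 1, a = 1, l = 1 (illustrative values)
omega = 2.0                # forcing frequency, not equal to omega_n
t = 3.0

numeric = forced_coefficient(lambda lam: math.sin(omega * lam), omega_n, t)
closed = (math.sin(omega * t) - omega / omega_n * math.sin(omega_n * t)) \
    / (omega_n ** 2 - omega ** 2)
```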
If $F(x,t)$ is periodic in $t$, there may be resonance, and this important phenomenon will now be discussed for the special case (cf. Chap. 1, Sec. 33)
$$F(x,t) = a(x)\sin\omega t + b(x)\cos\omega t. \tag{8-10}$$
[In the general case $F(x,t)$ is a sum of terms like (8-10), since the assumed periodicity enables us to express $F(x,t)$ as a Fourier series in $t$.] With $F(x,t)$ as in (8-10) the form of $B_n(t)$ can be determined by inspection of (8-5). Substitution into (8-7) then gives an equation of form
$$b_n'' + \omega_n^2 b_n = \alpha\sin\omega t + \beta\cos\omega t, \tag{8-11}$$
where $\alpha$ and $\beta$ are constant.
If $\omega^2 \ne \omega_n^2$ the solutions of (8-11) are all bounded, but if $\omega = \omega_n$, the particular integral involves the functions
$$t\sin\omega t, \quad t\cos\omega t$$
which increase indefinitely with $t$. Hence in that case the term
$$b_n(t)\sin\frac{n\pi x}{l} \tag{8-12}$$
in the Fourier series for $u(x,t)$ becomes strongly emphasized as $t$ increases, and we say, briefly, that the oscillation (8-12) is resonant.
A physical explanation is readily given in terms of the results of Sec. 5. Thus, the condition $\omega = \omega_n$ can be written as
$$\frac{2\pi}{\omega} = \frac{2l}{na}.$$
This asserts that the period of $F(x,t)$ in (8-10) is equal to the period for free oscillations of a string of length $l/n$. And $l/n$ is precisely the distance between nodes for a vibration of the type (8-12).
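The contrast between the bounded and the resonant case can be seen in the special case $\alpha = 1$, $\beta = 0$ of (8-11) with zero initial data (8-8). The closed forms used below are worked out here by direct solution, not quoted from the text, and the function name `response` is an assumption of this sketch: off resonance the solution is a bounded combination of sines, while at $\omega = \omega_n$ it is $(\sin\omega t - \omega t\cos\omega t)/2\omega^2$.

```python
import math

def response(omega, omega_n, t):
    """Solution of b'' + omega_n^2 b = sin(omega t) with b(0) = b'(0) = 0."""
    if math.isclose(omega, omega_n):
        # Resonant case: the term t*cos(omega t) grows without bound.
        return (math.sin(omega * t) - omega * t * math.cos(omega * t)) / (2.0 * omega ** 2)
    # Non-resonant case: bounded for all t.
    return (math.sin(omega * t) - omega / omega_n * math.sin(omega_n * t)) \
        / (omega_n ** 2 - omega ** 2)

omega_n = math.pi
bound = (1.0 + 2.0 / omega_n) / (omega_n ** 2 - 4.0)
off_peak = max(abs(response(2.0, omega_n, 0.05 * k)) for k in range(4000))
on_resonance = response(omega_n, omega_n, 200.0)    # after 100 forcing periods
```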
Example: A cord stretched between the fixed points $x = 0$ and $x = l$ is initially supported so that it forms a horizontal straight line. Discuss the oscillations when the support is suddenly removed.
The force function $F_1(x,t)$ in Sec. 2 is $-g\rho$, and hence the partial differential equation is
$$u_{tt} - a^2 u_{xx} = -g, \tag{8-13}$$
while the boundary and initial conditions are
$$u(0,t) = u(l,t) = 0, \quad u(x,0) = u_t(x,0) = 0. \tag{8-14}$$
If we succeed in finding a particular solution $u = v(x)$ of (8-13) which satisfies the boundary conditions
$$v(0) = v(l) = 0, \tag{8-15}$$
then the solution of (8-13) can be written as
$$u(x,t) = w(x,t) + v(x) \tag{8-16}$$
where, as follows from (8-13) and (8-14), $w(x,t)$ satisfies
$$w_{tt} - a^2 w_{xx} = 0, \quad w(0,t) = 0, \quad w(l,t) = 0, \quad w(x,0) = -v(x), \quad w_t(x,0) = 0. \tag{8-17}$$
Since the desired particular solution $v(x)$ is to be independent of $t$, the choice $u(x,t) = v(x)$ in (8-13) yields $a^2 v'' = g$, so that
$$v(x) = -\frac{g}{2a^2}\,x(l - x), \tag{8-18}$$
when the integration constants are determined so as to satisfy (8-15). This particular solution corresponds to the equilibrium position of the string under gravity. The solution of the system (8-17) can now be written down with the aid of (6-7) as
$$w(x,t) = \tfrac{1}{2} f_0(x - at) + \tfrac{1}{2} f_0(x + at),$$
where $f_0(x)$ is odd, has period $2l$, and is defined for $0 < x < l$ by
$$f_0(x) = -v(x) = \frac{g}{2a^2}\,x(l - x).$$
The required solution is $u = v + w$.
By interpreting $f_0(x)$, $f_0(x - at)$, and $f_0(x + at)$ graphically one finds that $w(x,t)$ is largest on $0 < x < l$ when $at = 0, 2l, 4l, \ldots$, and then $w(x,t) = f_0(x)$. Similarly, $w(x,t)$ is least when $at = l, 3l, 5l, \ldots$, and then $w(x,t) = -f_0(x)$. It follows that the cord oscillates between the horizontal position $u = 0$ and the position $u = 2v(x)$ in which each point is twice as low as the equilibrium position (8-18). The period is $2l/a$.
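The conclusion of the Example can be confirmed by evaluating $u = v + w$ directly. The sketch below assumes illustrative values $l = 1$, $a = 4$, $g = 32$; the function names `sag`, `f0`, and `cord` are introduced here and are not part of the text.

```python
import math

def sag(x, l, a, g=32.0):
    """Equilibrium shape v(x) of (8-18)."""
    return -g * x * (l - x) / (2.0 * a * a)

def f0(x, l, a, g=32.0):
    """Odd, 2l-periodic extension of -v(x) from (0, l)."""
    y = math.fmod(x, 2.0 * l)
    if y > l:
        y -= 2.0 * l
    elif y < -l:
        y += 2.0 * l
    return -sag(y, l, a, g) if y >= 0.0 else sag(-y, l, a, g)

def cord(x, t, l, a, g=32.0):
    """Displacement u = v + w of the released cord."""
    return sag(x, l, a, g) + 0.5 * f0(x - a * t, l, a, g) + 0.5 * f0(x + a * t, l, a, g)

l, a, x = 1.0, 4.0, 0.3
start = cord(x, 0.0, l, a)               # horizontal position at t = 0
lowest = cord(x, l / a, l, a)            # half a period later, at = l
back = cord(x, 2.0 * l / a, l, a)        # one full period, at = 2l
```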
PROBLEMS
1. A horizontal cable 100 ft long sags 5 ft when at rest under gravity. If the cable is disturbed so that it oscillates without nodes, what is the frequency of the oscillations? Hint: See (8-18).
2. A string of length $l$ is subjected to a force $F(x,t) = \sin\omega t\,\sin\pi x/l$, where $\omega$ is constant. Find the displacement $u(x,t)$ if the string was initially at rest in the equilibrium position. Be sure to distinguish the cases $\omega \ne \pi a/l$ and $\omega = \pi a/l$.
3. Show that the equilibrium shape of a string under a force $F(x)$ is described by
“-Sjfjf jO?' *>**•
4 . Show that the function
v(x,i)
ds dr
satisfies $v_{tt} - a^2 v_{xx} = F(x,t)$. Hint: Let $x + at = r$, $x - at = s$, $v(x,t) = V(r,s)$. Then, as in Sec. 1, $-4a^2 V_{rs} = F(x,t)$.
5. If $v(x,t)$ is the function obtained in Prob. 4, let $w(x,t)$ be determined by
$$w_{tt} - a^2 w_{xx} = 0, \quad w(x,0) = f(x) - v(x,0), \quad w_t(x,0) = g(x) - v_t(x,0)$$
(cf. Sec. 3). Then $u = v + w$ satisfies
$$u_{tt} - a^2 u_{xx} = F(x,t), \quad u(x,0) = f(x), \quad u_t(x,0) = g(x).$$
6. Solve the Example in the text by means of Fourier series.
SOLUTION BY SERIES
9. Heat Flow in One Dimension. The foregoing discussion of the
vibrating string enabled us to survey the field of partial differential equa-
tions and to illustrate a number of important methods. Prominent among
these is the method of infinite series, which will now be explored more
fully and used in a variety of applications. We begin with a problem from
the theory of heat conduction.
Consider a section cut from an insulated, uniform bar by two parallel planes $\Delta x$ units apart (Fig. 11), and suppose that the temperature of one of the planes is $u$ while that of the second plane is $u + \Delta u$.
[Fig. 11. A section of the bar between the planes at $x$ and $x + \Delta x$, at temperatures $u$ and $u + \Delta u$; the bar extends from $x = 0$ to $x = l$.]
It is known from experiment that heat flows from the plane at higher temperature to that at the lower, the amount of heat flowing per unit area per second being approximately
$$\text{Rate of flow} \approx -k\,\frac{\Delta u}{\Delta x}. \tag{9-1}$$
Here $k$ is a constant called the thermal conductivity of the material; its dimensions in the cgs system are cal/(cm-sec °C). In the limit as $\Delta x \to 0$, Eq. (9-1) can be regarded as an exact equality, so that
$$\text{Rate of flow} = -k\,u_x. \tag{9-2}$$
On the other hand, if $c$ is the heat capacity of the medium and $\rho$ its density, the amount of heat in the section from $x$ to $x + \Delta x$ is
$$(c\rho A\,\Delta x)\,\bar{u}, \tag{9-3}$$
where $A$ is the cross-sectional area and where $\bar{u}$ is the mean value of $u$ over the interval $(x, x + \Delta x)$. For a time interval $(t, t + \Delta t)$ the increase in amount of heat in the section $(x, x + \Delta x)$ can be computed from (9-3) and also from (9-2). The computation yields [1]
$$c\rho A\,\Delta x\,\bar{u}(x, t + \Delta t) - c\rho A\,\Delta x\,\bar{u}(x,t) = kA\,\Delta t\,\bar{u}_x(x + \Delta x, t) - kA\,\Delta t\,\bar{u}_x(x,t),$$
where $\bar{u}_x$ is the mean value of $u_x$ in the time interval $(t, t + \Delta t)$. Dividing by $c\rho A\,\Delta x\,\Delta t$ we obtain
$$\frac{\bar{u}(x, t + \Delta t) - \bar{u}(x,t)}{\Delta t} = \frac{k}{c\rho}\,\frac{\bar{u}_x(x + \Delta x, t) - \bar{u}_x(x,t)}{\Delta x} \tag{9-4}$$
and letting $\Delta x \to 0$, $\Delta t \to 0$ now gives
$$u_t = \alpha^2 u_{xx}, \qquad \alpha^2 = \frac{k}{c\rho}, \tag{9-5}$$
if we recall the definition of partial derivative.
[1] It is supposed that no heat is generated within the material and that $\rho$, $k$, and $c$ are constant over the relevant range of temperatures. If $\rho$ is measured in grams per cubic centimeter, the dimensions of $c$ are cal/(g °C).
The fact that (9-4) involves mean values causes no trouble when $u_t$ and $u_{xx}$ are continuous; see the discussion of Eq. (7-4), Chap. 5. Thus, (9-5) follows without approximation from appropriate physical assumptions. This contrasts to the wave equation $u_{tt} = a^2 u_{xx}$, which is only an approximate statement of Newton's law for the vibrating string.
We shall now solve (9-5) under the assumption that the initial temperature is a prescribed function $f(x)$,
$$u(x,0) = f(x), \quad 0 < x < l, \tag{9-6}$$
which can be represented by a convergent Fourier series. The ends of the bar are assumed to have the temperature zero:
$$u(0,t) = u(l,t) = 0, \quad t > 0. \tag{9-7}$$
Since $u_{xx}$ must exist if $u$ satisfies (9-5), we know that $u(x,t)$ has a Fourier series in $x$ for each fixed $t > 0$:
$$u(x,t) = \sum b_n(t)\sin\frac{n\pi x}{l}. \tag{9-8}$$
Here, a sine series is chosen because such a series automatically satisfies the requirement (9-7). Proceeding tentatively, assume that (9-8) can be differentiated term by term to give
$$\sum b_n'(t)\sin\frac{n\pi x}{l} = \alpha^2\sum b_n(t)\left[-\left(\frac{n\pi}{l}\right)^2\right]\sin\frac{n\pi x}{l} \tag{9-9}$$
upon substitution into (9-5). Equation (9-9) is satisfied if the coefficients of $\sin n\pi x/l$ on each side are equated:
$$b_n' = -\left(\frac{\alpha n\pi}{l}\right)^2 b_n.$$
Upon integration this gives
$$b_n(t) = c_n e^{-(\alpha n\pi/l)^2 t},$$
where the $c_n$ are constant, and hence (9-8) becomes
$$u(x,t) = \sum c_n e^{-(\alpha n\pi/l)^2 t}\sin\frac{n\pi x}{l}. \tag{9-10}$$
The initial condition (9-6) yields
$$f(x) = u(x,0) = \sum c_n\sin\frac{n\pi x}{l}. \tag{9-11}$$
Since the Fourier series for $f(x)$ converges to $f(x)$ by hypothesis, Eq. (9-11) is assured if $c_n$ are the Fourier coefficients,
$$c_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx.$$
The only questionable step in the foregoing discussion was the term-by-term differentiation, but this step can now be justified. Differentiating (9-10) term by term actually does give
$$u_{xx} = -\sum c_n e^{-(\alpha n\pi/l)^2 t}\left(\frac{n\pi}{l}\right)^2\sin\frac{n\pi x}{l}, \quad u_t = -\sum c_n e^{-(\alpha n\pi/l)^2 t}\left(\frac{\alpha n\pi}{l}\right)^2\sin\frac{n\pi x}{l} \tag{9-12}$$
because the series (9-12) are uniformly convergent when $t \ge \delta > 0$. (See Chap. 2, Sec. 7, Theorem IV. The uniform convergence follows from the convergence of
$$\sum n^2 e^{-(\alpha n\pi/l)^2\delta},$$
since the Fourier coefficients $c_n$ are bounded.) Hence, (9-10) is a solution of the problem. We cannot yet say that (9-10) is the solution, because there might be another solution, necessarily different from the one we found, for which the term-by-term differentiation is not permissible. A uniqueness theorem is established, however, in Sec. 24.
Because of the exponential factors the series (9-10) is rapidly convergent and affords a useful means of computing the temperature. By contrast, the series obtained in Sec. 6 for solutions of the wave equation converges no better than the series for the initial values $f(x)$ and $g(x)$. The physical significance of this difference in the two cases is discussed in Sec. 27.
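A partial sum of (9-10) is simple to compute. The sketch below is illustrative only: the names `heat_coeff` and `temperature`, the midpoint-rule quadrature, and the 40-term truncation are choices made here, and `alpha` stands for the constant in (9-5).

```python
import math

def heat_coeff(f, l, n, steps=2000):
    """Midpoint-rule approximation to c_n = (2/l) * integral of f(x) sin(n pi x / l)."""
    h = l / steps
    total = sum(f((k + 0.5) * h) * math.sin(n * math.pi * (k + 0.5) * h / l)
                for k in range(steps))
    return 2.0 / l * h * total

def temperature(f, l, alpha, x, t, terms=40):
    """Partial sum of the series (9-10) for a bar with both ends held at zero."""
    u = 0.0
    for n in range(1, terms + 1):
        decay = math.exp(-(alpha * n * math.pi / l) ** 2 * t)
        u += heat_coeff(f, l, n) * decay * math.sin(n * math.pi * x / l)
    return u

# A single sine mode decays exponentially without changing shape.
l, alpha = 1.0, 0.5
u_mode = temperature(lambda s: math.sin(math.pi * s / l), l, alpha, 0.3, 0.2)
# A uniformly warm bar (f = 1) cools toward zero, fastest near the ends.
u_mid = temperature(lambda s: 1.0, l, alpha, 0.5, 0.2)
u_edge = temperature(lambda s: 1.0, l, alpha, 0.1, 0.2)
```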
Example 1. Find the steady-state temperature of a uniform bar.
It is required that $u(x,t)$ be independent of $t$, whence by (9-5)
$$\alpha^2 u_{xx} = u_t = 0.$$
Hence, $u = c_0 + c_1 x$, where $c_0$ and $c_1$ are constant. If the temperatures at the ends are, respectively, $u_0$ and $u_1$, we can determine the constants and thus obtain the formula
$$u(x,t) = u_0 + \frac{x}{l}(u_1 - u_0). \tag{9-13}$$
The rate of heat flow is given by (9-2) and (9-13) as
$$\text{Rate of flow} = -k\,\frac{u_1 - u_0}{l}, \tag{9-14}$$
and hence, (9-1) holds without approximation in the steady state.
Example 2. A rod of length 5 has the end $x = 0$ at 0°, the end $x = 5$ at 10°, and the initial temperature is $f(x)$. Find the temperature distribution.
If $v(x,t)$ is the unknown temperature at point $x$ and time $t$, we let
$$u = v - 2x, \tag{9-15}$$
where $2x$ is the steady-state temperature determined from (9-13). Then $\alpha^2 u_{xx} = u_t$, $u(x,0) = f(x) - 2x$, $u(0,t) = u(5,t) = 0$. Hence $u$ is given by (9-10), where the $c_n$'s are the Fourier coefficients of $f(x) - 2x$. When we have found $u$, Eq. (9-15) gives $v$.
We have noted that the value $2x$ introduced in (9-15) is the steady-state temperature as determined by Example 1. The same method enables us to replace any constant boundary conditions by the homogeneous conditions $u(0,t) = u(l,t) = 0$. That is, if the unknown temperature $v(x,t)$ satisfies
$$v(0,t) = v_0, \quad v(l,t) = v_1, \quad v_0 \text{ and } v_1 \text{ const}, \tag{9-16}$$
we define $u$ to be the difference between $v$ and the steady-state temperature:
$$u(x,t) = v(x,t) - \left[v_0 + \frac{x}{l}(v_1 - v_0)\right].$$
Then $u(0,t) = u(l,t) = 0$, and hence $u$ can be determined by the method of the text. A similar use of the steady-state solution was made in the Example, Sec. 8.
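The reduction to homogeneous boundary conditions can be followed step by step in code. The sketch below treats constant end temperatures `v0` and `v1` as in (9-16); the function name `rod_temperature`, the zero initial temperature, and the value `alpha = 1` are assumptions made only to have something definite to compute.

```python
import math

def rod_temperature(f, l, v0, v1, alpha, x, t, terms=40, steps=2000):
    """Temperature v(x,t) with v(0,t) = v0, v(l,t) = v1, v(x,0) = f(x)."""
    steady = lambda s: v0 + s / l * (v1 - v0)      # steady state from (9-13)
    h = l / steps
    u = 0.0
    for n in range(1, terms + 1):
        # Fourier sine coefficient of f(x) minus the steady state (midpoint rule).
        cn = 2.0 / l * h * sum(
            (f((k + 0.5) * h) - steady((k + 0.5) * h))
            * math.sin(n * math.pi * (k + 0.5) * h / l)
            for k in range(steps))
        u += cn * math.exp(-(alpha * n * math.pi / l) ** 2 * t) \
            * math.sin(n * math.pi * x / l)
    return u + steady(x)                            # v = u + steady state

# Example 2 with an assumed initial temperature f(x) = 0 and alpha = 1.
cold = lambda s: 0.0
v_early = rod_temperature(cold, 5.0, 0.0, 10.0, 1.0, 2.5, 1.0)
v_late = rod_temperature(cold, 5.0, 0.0, 10.0, 1.0, 2.5, 100.0)
v_end = rod_temperature(cold, 5.0, 0.0, 10.0, 1.0, 5.0, 1.0)
```

As $t$ grows the exponential factors die out and the temperature at $x = 2.5$ approaches the steady-state value $2x = 5$.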
PROBLEMS
1. Compute the loss of heat per day per square meter of a large concrete wall whose thickness is 25 cm if one face is kept at 0°C and the other at 30°C. Use $k = 0.002$, and assume steady-state conditions. Hint: The wall can be thought to be composed of bars 25 cm long perpendicular to the wall faces. By symmetry, no heat flows through the sides of these bars in the steady state, and hence (9-14) can be applied.
2. An insulated metal rod 1 m long has its ends kept at 0°C and its initial temperature is 50°C. What is the temperature in the middle of the rod at any subsequent time? Use $k = 1.02$, $c = 0.06$, and $\rho = 9.6$.
3. Let the rod of Prob. 2 have one of its ends kept at 0°C and the other at 10°C. If the initial temperature of the rod is 50°C, find the temperature of the rod at any later time. Hint: See Example 2.
4. An insulated bar with unit cross-sectional area has its ends kept at temperature 0, and the initial temperature is $f(x) = c_n\sin n\pi x/l$, where $c_n$ is constant and $n$ is an integer. (a) Show that the amount of heat present in the bar initially is $2lc\rho c_n/n\pi$ if $n$ is odd and 0 if $n$ is even. (b) Show that the net rate of flow out of the bar across the ends is $2kc_n(n\pi/l)e^{-(\alpha n\pi/l)^2 t}$ when $n$ is odd and 0 when $n$ is even. Hint: The rate of flow out of the bar at the end $x = 0$ is $+ku_x$, not $-ku_x$. (c) How much heat flows out of the bar in the time from $t = 0$ to $t$? Evaluate as $t \to \infty$, compare (a), and explain.
5. By addition of the results in Prob. 4 obtain similar results for the bar with arbitrary initial temperature $f(x)$.
10. Other Boundary Conditions. Separation of Variables. In the foregoing section the differential equation
$$u_t = \alpha^2 u_{xx} \tag{10-1}$$
was obtained for the temperature $u(x,t)$ of an insulated bar at point $x$ and time $t$. The initial condition was
$$u(x,0) = f(x), \quad 0 < x < l, \tag{10-2}$$
and the ends were held at constant temperature.
If, instead, the ends are insulated, the boundary conditions are
$$u_x(0,t) = 0, \quad u_x(l,t) = 0. \tag{10-3}$$
Equations (10-3) are appropriate because by (9-2) they state that the rate of flow across the ends is zero. We shall now consider the problem posed by (10-1)-(10-3).
The boundary conditions (10-3) are satisfied automatically if we express $u(x,t)$ as a cosine series:
$$u(x,t) = \frac{1}{2}a_0(t) + \sum a_n(t)\cos\frac{n\pi x}{l}. \tag{10-4}$$
Thus, $u_x$ in (10-4) is a sine series (assuming that one can differentiate term by term), and we have already noted that the sine series vanishes at $x = 0$ and $l$.
Substituting (10-4) into the differential equation (10-1) gives
$$a_0' = 0, \quad a_n' = -\left(\frac{\alpha n\pi}{l}\right)^2 a_n, \tag{10-5}$$
just as in the derivation of (9-9). Solving (10-5) and substituting into (10-4), we find
$$u(x,t) = \frac{1}{2}c_0 + \sum c_n e^{-(\alpha n\pi/l)^2 t}\cos\frac{n\pi x}{l}, \tag{10-6}$$
where the $c_n$'s are constant. The initial condition (10-2) shows that the $c_n$'s are the Fourier cosine coefficients,
$$c_n = \frac{2}{l}\int_0^l f(x)\cos\frac{n\pi x}{l}\,dx, \tag{10-7}$$
and the problem is solved. [1]
[1] The solution can be verified, if desired, as in the previous section.
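The cosine series (10-6) can be summed in the same way as the sine series of the preceding section. In the sketch below the names `cosine_coeff` and `insulated_temperature` are invented for illustration and the coefficients (10-7) are approximated by a midpoint rule. Because the ends are insulated, no heat escapes, and the temperature tends to the mean value $\tfrac{1}{2}c_0$ of the initial distribution.

```python
import math

def cosine_coeff(f, l, n, steps=2000):
    """Midpoint-rule approximation to the cosine coefficient (10-7)."""
    h = l / steps
    total = sum(f((k + 0.5) * h) * math.cos(n * math.pi * (k + 0.5) * h / l)
                for k in range(steps))
    return 2.0 / l * h * total

def insulated_temperature(f, l, alpha, x, t, terms=40):
    """Partial sum of (10-6) for a bar whose ends are insulated."""
    u = 0.5 * cosine_coeff(f, l, 0)
    for n in range(1, terms + 1):
        decay = math.exp(-(alpha * n * math.pi / l) ** 2 * t)
        u += cosine_coeff(f, l, n) * decay * math.cos(n * math.pi * x / l)
    return u

# Initial temperature f(x) = x on (0, 1): the limit is the mean value 1/2.
u_limit = insulated_temperature(lambda s: s, 1.0, 1.0, 0.2, 5.0)
# A single cosine mode decays without changing shape.
u_mode = insulated_temperature(lambda s: math.cos(2.0 * math.pi * s), 1.0, 1.0, 0.1, 0.01)
```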
We shall now solve this same problem by an important method known
as separation of variables. It will prove interesting to compare the various
stages of the solution with the answer, (10-6).
The desired solution (10-6) is a sum of terms each of which has the form
X(x)T(t). (10-8)
In the method of separating variables the idea is to construct functions
of the form (10-8) which satisfy the differential equation and the boundary
conditions. By superposition of these functions (10-8), one then satisfies
the initial conditions. The fact that there is a solution of the type (10-6)
gives good reason for expecting the method to succeed.
Substituting (10-8) into (10-1) yields
$$XT' = \alpha^2 X''T,$$
where the prime denotes differentiation with respect to the appropriate variable. Dividing by $XT$ we get
$$\frac{T'}{T} = \alpha^2\frac{X''}{X}. \tag{10-9}$$
The variables $x$ and $t$ in (10-9) are separated, in that the left side is a function of $t$ alone and the right side is a function of $x$ alone. It follows that each side must be constant, independent of both $x$ and $t$. A brief investigation of the effect of changing sign in (10-10) shows that $XT$ can satisfy (10-3) only if the constant is zero or a negative number $-p^2$. Thus,
$$\frac{T'}{T} = -p^2, \quad \alpha^2\frac{X''}{X} = -p^2. \tag{10-10}$$
Independent solutions of (10-10) are [1]
$$T = e^{-p^2 t}; \quad X = \cos\frac{p}{\alpha}x, \quad X = \sin\frac{p}{\alpha}x. \tag{10-11}$$
The boundary condition $u_x(0,t) = 0$ for $u = XT$ requires that $X'(0) = 0$, and hence the appropriate choice of $X$ in (10-11) is
$$X = \cos\frac{p}{\alpha}x. \tag{10-12}$$
Similarly, the condition $u_x(l,t) = 0$ gives $X'(l) = 0$, so that
$$p = \frac{n\pi\alpha}{l}, \tag{10-13}$$
where $n$ is an integer. By (10-11)-(10-13) we see that the function
$$T(t)X(x) = e^{-(n\pi\alpha/l)^2 t}\cos\frac{n\pi x}{l} \tag{10-14}$$
satisfies the differential equation and the boundary conditions. To satisfy the initial conditions we form a superposition of terms (10-14). The resulting series is precisely the series (10-6), and the solution is completed as before.
[1] It is suggested that the reader compare $XT$ at this and subsequent stages with the general term of (10-6).
The merit of the separation method is that it produced the functions $\cos(n\pi x/l)$ by direct consideration of the differential equation. If some other functions had been more appropriate, the method would have produced those other functions instead. This fact will now be illustrated by an example.
According to Newton's law of cooling, a body radiates heat at a rate proportional to the difference between the temperature $u$ of the radiating body and the temperature $u_0$ of the surrounding medium. Thus, if our insulated rod of length $l$ has the end $x = 0$ maintained at temperature 0 while the other end radiates into a medium of temperature $u_0 = 0$, the corresponding boundary conditions are
$$u(0,t) = 0, \quad u_x(l,t) = -h\,u(l,t), \tag{10-15}$$
where $h$ is constant. [The second condition (10-15) states that the rate of flow $-ku_x$ is proportional to $u(l,t) - 0$, and this agrees with Newton's law.] If $h = 0$, there is no radiation and we have the condition for an insulated end as discussed previously. But if $h > 0$, which we now assume, the problem is essentially different from those considered hitherto. The difference results from the fact that (10-15) cannot be satisfied in any simple way by an ordinary Fourier series.
Actually, as we show next, the appropriate functions for the problem (10-1), (10-2), and (10-15) are not $\sin(n\pi x/l)$ or $\cos(n\pi x/l)$ but are $\sin\beta_n x$, where the $\beta_n$'s are the positive roots of the transcendental equation [1]
$$\beta\cos\beta l = -h\sin\beta l. \tag{10-16}$$
Although one could hardly expect to discover the sequence $\sin\beta_n x$ by a priori considerations, it is produced automatically by the method of separating variables. The solution to the problem is found to be
$$u(x,t) = \sum c_n e^{-\alpha^2\beta_n^2 t}\sin\beta_n x, \tag{10-17}$$
where $c_n$ is given in terms of the initial values by
$$c_n = \frac{\displaystyle\int_0^l f(x)\sin\beta_n x\,dx}{\displaystyle\int_0^l \sin^2\beta_n x\,dx}. \tag{10-18}$$
[1] Since the equation is equivalent to $\tan\beta l = -\beta/h$ when $h \ne 0$, its roots can be obtained graphically by considering the intersection of the curves $y = \tan\beta l$ and $y = -\beta/h$. Cf. Example 2, Sec. 2, Chap. 9.
To obtain this solution by separating variables, observe that the substitution $u = X(x)T(t)$ leads to functions of the type (10-11), exactly as in the former case. Here, however, the condition $u(0,t) = 0$ gives $X(0) = 0$, so that we require the sine rather than the cosine. The resulting expression
$$T(t)X(x) = e^{-p^2 t}\sin\frac{p}{\alpha}x$$
becomes
$$e^{-\alpha^2\beta^2 t}\sin\beta x \tag{10-19}$$
if we set $p = \alpha\beta$, and this form will be more convenient for our purposes. The function (10-19) satisfies (10-1) and the first boundary condition (10-15) for all values of the constant $\beta$. To satisfy the second condition (10-15) we must choose $\beta$ so that
$$e^{-\alpha^2\beta^2 t}\,\beta\cos\beta l = -h\,e^{-\alpha^2\beta^2 t}\sin\beta l, \tag{10-20}$$
and this leads to (10-16). The resulting functions
$$e^{-\alpha^2\beta_n^2 t}\sin\beta_n x$$
satisfy both boundary conditions (10-15) and also satisfy the differential equation (10-1). If a suitable superposition (10-17) is found to satisfy the initial condition, our problem will be solved.
Setting $t = 0$ in (10-17) gives
$$f(x) = \sum c_n\sin\beta_n x. \tag{10-21}$$
As in Chap. 2, Sec. 22, Example 1, we can show that the functions $\sin\beta_n x$ are orthogonal on $(0,l)$, and hence the $c_n$’s are given by (10-18). The solution can be verified by the method of Sec. 9 if $f(x)$ admits an expansion (10-21). Since an analogue of Dirichlet’s theorem holds for the sequence $\sin\beta_n x$, Eq. (10-21) is not a serious restriction on $f(x)$.
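The roots of (10-16) and the orthogonality of the functions $\sin\beta_n x$ can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values $h = 1$, $l = 1$ are assumptions chosen for illustration. It relies on the fact that, for $h > 0$, the $n$th root lies between $(n - \tfrac12)\pi/l$ and $n\pi/l$, where the left side minus the right side of (10-16) changes sign.

```python
import math

def radiation_roots(h, l, count, tol=1e-13):
    """Return the first `count` positive roots of beta*cos(beta*l) = -h*sin(beta*l), h > 0."""
    def g(beta):
        return beta * math.cos(beta * l) + h * math.sin(beta * l)

    roots = []
    for n in range(1, count + 1):
        # For h > 0 the n-th root lies between (n - 1/2)*pi/l and n*pi/l,
        # where g changes sign, so plain bisection is enough.
        lo, hi = (n - 0.5) * math.pi / l, n * math.pi / l
        g_lo = g(lo)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if g(mid) * g_lo > 0:   # same sign as the left end: root is to the right
                lo, g_lo = mid, g(mid)
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

def sine_inner_product(b1, b2, l):
    """Exact integral of sin(b1*x)*sin(b2*x) over (0, l) for b1 != b2."""
    return (math.sin((b1 - b2) * l) / (2 * (b1 - b2))
            - math.sin((b1 + b2) * l) / (2 * (b1 + b2)))

betas = radiation_roots(h=1.0, l=1.0, count=3)
print(betas)
print(sine_inner_product(betas[0], betas[1], 1.0))  # should be close to zero
```

The second printed value is the inner product of two distinct eigenfunctions; it vanishes up to rounding, which is what makes formula (10-18) for the coefficients work.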
PROBLEMS
1. If $f(x) = g(t)$, where $x$ and $t$ are independent variables, show that $f(x)$ and $g(t)$ are constant. Hint: Let $t = t_0$, a fixed value.
2. Attempt to satisfy the conditions (10-3) by choosing a positive constant $+p^2$ instead of $-p^2$ in (10-10).
3. By using the functions (10-11) solve
$$u_t = \alpha^2 u_{xx}, \qquad u(0,t) = u(l,t) = 0, \qquad u(x,0) = f(x).$$
4. (a) Describe a physical situation which would lead to
$$u_t = \alpha^2 u_{xx}, \qquad u(0,t) = u_x(l,t) = 0, \qquad u(x,0) = f(x).$$
(b) Solve by separating variables [cf. (10-11)]. (c) Verify that your result agrees with (10-16) to (10-18) for $h = 0$.
5. Solve Prob. 4 by the method of images.
Outline of the Solution. Consider a rod of length $2l$ with ends at temperature 0. Let the initial temperature $f_0(x)$ agree with $f(x)$ on $(0,l)$, and let $f_0(x)$ be symmetric about $x = l$ (Fig. 12). By symmetry, no heat flows across the center, and hence the left half of the long rod behaves like the rod of Prob. 4. The temperature $u_0(x,t)$ for the long rod can be found from (9-10).
6. The vertical displacement $u(x,t)$ of a vibrating string with fixed end points satisfies
$$u_{tt} = a^2 u_{xx}, \qquad u(0,t) = u(l,t) = 0.$$
By setting $u(x,t) = X(x)T(t)$ and separating variables, obtain solutions of the form
$$\sin\frac{n\pi at}{l}\sin\frac{n\pi x}{l} \quad\text{and}\quad \cos\frac{n\pi at}{l}\sin\frac{n\pi x}{l}.$$
7. In Prob. 6, express $u(x,t)$ as an infinite series if
$$u(x,0) = f(x), \qquad u_t(x,0) = 0.$$
8. In Prob. 6, express $u(x,t)$ as an infinite series if
$$u(x,0) = 0, \qquad u_t(x,0) = g(x).$$
11. Heat Flow in a Solid. By a procedure similar to that of Sec. 9 one can establish the equation
$$u_t = \alpha^2(u_{xx} + u_{yy} + u_{zz}), \qquad \alpha^2 = \frac{k}{c\rho}, \tag{11-1}$$
for the temperature¹ $u = u(x,y,z,t)$ in a uniform solid at time $t$. This is the three-dimensional form of the equation
$$u_t = \alpha^2 u_{xx} \tag{11-2}$$
obtained previously for heat conduction in a rod. The state of the solid at time $t = 0$ gives the initial condition; the state of the surface for $t > 0$ gives the boundary condition. For instance, if the surface radiates according to Newton’s law, the boundary condition is
$$\frac{\partial u}{\partial n} = -e(u - u_0), \tag{11-3}$$
where $u_0$ is the temperature of the surrounding medium, $e$ the emissivity, and $\partial u/\partial n$ the derivative in the direction of the outward normal. When $e = 0$, Eq. (11-3) means that the body is insulated.
¹ See the derivation in Chap. 5, Sec. 16. A similar equation governs diffusion and the drying of porous solids, with $u$ equal to the concentration of the diffusing substance. Because of this analogy many problems on diffusion and heat conduction are mathematically indistinguishable. The constant $\alpha^2$ in (11-1) is often called the diffusivity.
Sometimes there is so much symmetry that $u$ in (11-1) does not depend on $y$ or $z$. In this case (11-1) is the same as (11-2), since the terms $u_{yy}$ and $u_{zz}$ in (11-1) are zero, and the analysis of Secs. 9-10 can be applied without change.
As a specific illustration consider a uniform plate extending from the plane $x = 0$ to the plane $x = d$ (Fig. 13). Let $u = u_0$ on the surface $x = 0$ and $u = u_1$ on the surface $x = d$, where $u_0$ and $u_1$ are constant. If the plate is infinite, or if the edges are far away from the points being considered, the symmetry suggests that $u$ depends on $x$ only and, hence, that (11-2) holds. The steady-state temperature is then given by Example 1, Sec. 9, as
$$u = u_0 + \frac{x}{d}(u_1 - u_0).$$
Since the rate of flow is $-ku_x$, the amount of heat $Q$ flowing across the area $A$ in $t$ sec is
$$Q = ktA\,\frac{u_0 - u_1}{d}.$$
If the flow of heat is steady, so that $u$ is independent of time, then $u_t = 0$ and (11-1) reduces to
$$u_{xx} + u_{yy} + u_{zz} = 0. \tag{11-4}$$
This is known as Laplace's equation; it occurs in a variety of physical problems. The corresponding two-dimensional form is
$$u_{xx} + u_{yy} = 0, \qquad u = u(x,y). \tag{11-5}$$
To illustrate the use of (11-5) we shall discuss the steady-state temperature in an infinitely long metal strip of width $d$ (see Fig. 14). If the sides of the strip have the temperature zero and the bottom edge has the temperature $f(x)$, the boundary conditions are
$$u(0,y) = 0, \qquad u(d,y) = 0, \qquad u(x,0) = f(x). \tag{11-6}$$
We assume besides that (11-5) holds for $0 < x < d$, $y > 0$.
It is a surprising fact that these conditions do not suffice to determine the temperature.¹ However, one expects the temperature to approach zero as one moves away from the bottom edge, so that
$$\lim_{y\to\infty} u(x,y) = 0 \quad\text{uniformly in } x. \tag{11-7}$$
If this condition is explicitly required, the solution can be shown to be unique (see Sec. 24).
Although the problem can be solved very simply by Fourier series, we prefer to show how the desired functions are generated by the method of separating variables. The choice $u = X(x)Y(y)$ in (11-5) gives
$$\frac{X''}{X} = -\frac{Y''}{Y} \tag{11-8}$$
after dividing by $XY$. Since the variables in (11-8) are separated, each side is a constant. The boundary conditions applied to $XY$ show (after some calculation) that the constant must be a negative number $-p^2$, and hence (11-8) gives
$$X'' + p^2X = 0, \qquad Y'' - p^2Y = 0.$$
Since $(-p)^2 = p^2$, we can assume that $p > 0$ with no loss of generality. Linearly independent solutions of these equations are, respectively,
$$\cos px,\ \sin px \quad\text{and}\quad e^{py},\ e^{-py}.$$
Since $u(0,y) = 0$ requires that $X(0) = 0$, we reject the cosine, and in view of (11-7) we reject the solution $e^{py}$. Hence the function $XY$ takes the form
$$XY = e^{-py}\sin px. \tag{11-9}$$
The boundary condition $u(d,y) = 0$ gives $p = n\pi/d$, where $n$ is an integer. Forming a linear combination of the resulting solutions (11-9) we get
$$u(x,y) = \sum c_n e^{-(n\pi/d)y}\sin\frac{n\pi x}{d}, \tag{11-10}$$
and the condition $u(x,0) = f(x)$ now shows that the $c_n$’s are the Fourier coefficients
$$c_n = \frac{2}{d}\int_0^d f(x)\sin\frac{n\pi x}{d}\,dx.$$
The solution can be verified, if desired, as in Sec. 9.
1 The trouble is that the other end of the strip must be taken into account even though
it is infinitely far away. This purpose is served by (11-7).
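The series (11-10) is easy to evaluate numerically once the coefficients $c_n$ are known. The sketch below is an illustration only: the edge temperature $f(x) = x(d - x)$, the width $d = 1$, the number of terms, and the function names are all assumptions, and the coefficients are computed by Simpson's rule rather than in closed form.

```python
import math

def sine_coefficient(f, d, n, panels=2000):
    """c_n = (2/d) * integral_0^d f(x) sin(n*pi*x/d) dx, by composite Simpson's rule."""
    step = d / panels            # panels must be even for Simpson's rule
    total = 0.0
    for j in range(panels + 1):
        x = j * step
        weight = 1 if j in (0, panels) else (4 if j % 2 else 2)
        total += weight * f(x) * math.sin(n * math.pi * x / d)
    return (2.0 / d) * total * step / 3.0

def strip_temperature(f, d, x, y, terms=50):
    """Partial sum of (11-10) with `terms` terms."""
    return sum(sine_coefficient(f, d, n)
               * math.exp(-n * math.pi * y / d)
               * math.sin(n * math.pi * x / d)
               for n in range(1, terms + 1))

d = 1.0
edge = lambda x: x * (d - x)      # hypothetical edge temperature f(x)
print(strip_temperature(edge, d, 0.5, 0.0))   # close to f(0.5) = 0.25
print(strip_temperature(edge, d, 0.5, 2.0))   # much smaller, as (11-7) requires
```

Because each term carries the factor $e^{-(n\pi/d)y}$, the temperature away from the bottom edge is dominated by the first term of the series.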
The foregoing derivation obscures an important point which will now be discussed more fully. Although the solutions
$$e^{py} \quad\text{and}\quad e^{-py}$$
can be chosen for the equation $Y'' = p^2Y$, these are not the only possibilities. Another pair of independent solutions, for example, is
$$\cosh py \quad\text{and}\quad \sinh py.$$
If, now, we try to decide which of these functions satisfies (11-7) it will be found that neither one does.
What is really involved is the following: The general solution of $Y'' = p^2Y$ is
$$Y = ae^{py} + be^{-py},$$
where $a$ and $b$ are constant. By (11-7) we get $a = 0$, and hence $Y = be^{-py}$. The reader can verify that if
$$Y = a_0\cosh py + b_0\sinh py,$$
the condition (11-7) will give $a_0 + b_0 = 0$, and again $Y$ is a multiple of $e^{-py}$. Similar remarks apply to the construction of $X(x)$ and to the derivation of (10-14).
Just as in the case of the rod, this problem involving a strip can be given a three-dimensional interpretation. That is, the strip need not be thin provided there is no variation of temperature across its thickness. By letting the thickness approach infinity, we get a semi-infinite plate. (In Fig. 14 the plate extends infinitely far toward and away from the reader; the area outlined in the figure is the cross section of the plate, not a frontal view.) The boundary-value problem for the plate is
$$u_{xx} + u_{yy} + u_{zz} = 0, \qquad 0 < x < d,\ y > 0,\ -\infty < z < \infty, \tag{11-11}$$
$$u(0,y,z) = 0, \qquad u(d,y,z) = 0, \qquad u(x,0,z) = f(x), \tag{11-12}$$
$$\lim_{y\to\infty} u(x,y,z) = 0 \quad\text{uniformly in } x \text{ and } z. \tag{11-13}$$
If we assume $u$ independent of $z$, the resulting problem is the same as that formerly considered, hence has the unique solution (11-10).
The fact that $u(x,y,z)$ is independent of $z$ does not follow from the physical symmetry but requires the condition (11-13). Indeed the function
$$u = \sin\frac{\pi x}{d}\sin\frac{\pi y}{d}\,e^{\sqrt{2}\,\pi z/d}$$
satisfies (11-11) and (11-12) with $f(x) = 0$ and yet depends on $z$. Reduction of the dimension by omitting a variable is really an application of uniqueness. If we verify that (11-10) satisfies the problem (11-11) to (11-13), and that the problem has no other solution, then it is true that $u$ must be independent of $z$.
PROBLEMS
1. A refrigerator door is 10 cm thick and has the outside dimensions 60 by 100 cm. If the temperature inside the refrigerator is $-10$°C and outside is 20°C, and if $k = 0.0002$, find the gain of heat per day across the door by assuming the flow of heat to be of the same nature as that across an infinite plate.
2. If $f(x) = 1$ and $d = \pi$, show that (11-10) gives
$$u = \frac{4}{\pi}\left(e^{-y}\sin x + \frac{1}{3}e^{-3y}\sin 3x + \frac{1}{5}e^{-5y}\sin 5x + \cdots\right).$$
5. A semi-infinite plate 10 cm in thickness has its faces kept at 0°C and its base kept at 100°C. What is the steady-state temperature at any point of the plate?
6. The faces of an infinite slab 10 cm thick are kept at temperature 0°C. If the initial temperature of the slab is 100°C, what is the state of the temperature at any subsequent time?
7. A large rectangular iron plate (Fig. 15) is heated throughout to 100°C and is placed in contact with and between two like plates each at 0°C. The outer faces of these outside plates are maintained at 0°C. Find the temperature of the inner faces of the two plates and the temperature at the mid-point of the inner plate 10 sec after the plates have been put together. Given: $\alpha = 0.2$ cgs unit. Hint: The boundary and initial conditions are
$$u(0,t) = 0, \qquad u(3,t) = 0, \qquad u(x,0) = f(x),$$
where $f(x) = 0$ for $0 < x < 1$ and $2 < x < 3$ but $f(x) = 100$ for $1 < x < 2$.
12. The Dirichlet Problem. The Laplace equation
$$u_{xx} + u_{yy} + u_{zz} = 0 \tag{12-1}$$
was obtained in Sec. 11 for steady-state heat flow. We shall show how the same equation arises in electrostatics and gravitation.¹
It is a consequence of Coulomb's law that the potential due to a point charge $q$ at $(x_1,y_1,z_1)$ is
$$u = \frac{q}{r}, \qquad \text{taking } u = 0 \text{ at } r = \infty, \tag{12-2}$$
where $r$ is the distance from the charge to the point $(x,y,z)$ at which $u$ is computed. Thus,
$$r^2 = (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2, \qquad r > 0. \tag{12-3}$$
The potential due to a distribution of $n$ point charges $q_i$ is given by addition,
$$u = \sum_{i=1}^{n}\frac{q_i}{r_i}, \tag{12-4}$$
and the potential due to a distribution of continuous charge of density $\rho$ in a body $\tau$ can be obtained from an expression like (12-4) by passing to the limit.
¹ A more complete discussion is given in Chap. 5, Sec. 14. The relation of Laplace’s equation and fluid flow is developed in Chap. 5, Secs. 15 and 17, and in Chap. 7, Sec. 19.
It is easily shown that $1/r$ satisfies Laplace’s equation (12-1), and hence the same is true of $u$ in (12-4) provided no $r_i$ is zero. This latter condition means that there is no charge at the point of observation. One would expect, therefore, that the potential due to a continuous charge distribution will also satisfy (12-1) if there is no charge at the point of observation. This is actually the case, and that is the reason why Laplace’s equation plays such a prominent role in electrostatics. Although a more sophisticated treatment may be given, it all comes down to the same thing; namely, $1/r$ satisfies (12-1), and the potential is given by some sort of superposition process applied to $1/r$.
Since the gravitational potential satisfies (12-2) (where q is the mass of
the attracting mass point), the study of gravitation also leads to Laplace’s
equation. In view of its many applications, the Laplace equation (12-1)
is profitably regarded as a field of study in its own right. Such a study
leads the way to a branch of analysis known as potential theory .
An important problem in potential theory is the Dirichlet problem, which can be stated as follows: Suppose given a body $\tau$ in $(x,y,z)$ space, together with assigned values $f(x,y,z)$ on the surface of $\tau$. Find a function $u$ which satisfies Laplace’s equation in $\tau$ and is equal to $f(x,y,z)$ on the surface. The foregoing discussion gives a number of physical interpretations. For instance, if $u$ is temperature, the Dirichlet problem is to find
the steady-state temperature in a uniform solid when the temperature
on the surface is given. But if u is the electrostatic potential, the problem
is to find the potential inside a closed surface when the potential on the
surface is known. Interpretations in terms of diffusion, fluid flow, and
gravitation can also be given.
Since solutions of Laplace’s equation are often called harmonic functions,
Dirichlet’s problem can be stated as follows: Find a function which is
harmonic in a given region and assumes preassigned values on the boundary.
In two dimensions a harmonic function $u(x,y)$ satisfies
$$u_{xx} + u_{yy} = 0. \tag{12-5}$$
The region in Dirichlet’s problem is now a plane region, and its boundary is a curve. The physical interpretation refers to phenomena in a thin plane sheet, or it refers to three-dimensional phenomena which show no dependence on $z$. The latter condition is to be expected when there is cylindrical symmetry, that is, when all planes $z = \text{const}$ exhibit the same geometry and boundary conditions.
We shall now solve the Dirichlet problem for a circle. It turns out that the problem is greatly simplified by use of polar coordinates appropriate to the circular symmetry. With
$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad u(x,y) = U(r,\theta),$$
an elementary calculation shows that (12-5) becomes
$$(rU_r)_r + \frac{1}{r}U_{\theta\theta} = 0 \tag{12-6}$$
(see Prob. 2). The boundary condition can be expressed as
$$U(R,\theta) = f(\theta), \tag{12-7}$$
where $f(\theta)$ is a known function of $\theta$ and $R$ is the radius of the circle.
For each value of $r$ it is clear that $U$ has period $2\pi$ in $\theta$, since $u$ is single-valued, and therefore $U$ has a Fourier series
$$U(r,\theta) = \frac{a_0(r)}{2} + \sum[a_n(r)\cos n\theta + b_n(r)\sin n\theta]. \tag{12-8}$$
Proceeding tentatively, we substitute (12-8) into (12-6) to obtain
$$\frac{(ra_0')'}{2} + \sum[(ra_n')'\cos n\theta + (rb_n')'\sin n\theta] - \frac{1}{r}\sum(a_n n^2\cos n\theta + b_n n^2\sin n\theta) = 0.$$
Since the coefficients of $\cos n\theta$ and of $\sin n\theta$ must vanish,
$$(ra_n')' = \frac{1}{r}n^2a_n, \qquad n = 0, 1, 2, \ldots,$$
$$(rb_n')' = \frac{1}{r}n^2b_n, \qquad n = 1, 2, 3, \ldots.$$
These equations are both of form
$$r(ry')' = n^2y$$
which is readily solved by the method of Chap. 1, Sec. 30. Specifically, the substitution $y = r^\alpha$ gives
$$r(\alpha r^\alpha)' = n^2r^\alpha,$$
whence $\alpha = \pm n$. Since $a_n(r)$ and $b_n(r)$ must be finite at $r = 0$, the minus sign is excluded, and
$$a_n(r) = a_nr^n, \qquad b_n(r) = b_nr^n,$$
where $a_n$ and $b_n$ are constant. Hence by (12-8)
$$U(r,\theta) = \frac{a_0}{2} + \sum(a_nr^n\cos n\theta + b_nr^n\sin n\theta). \tag{12-9}$$
Putting $r = R$ and using the boundary condition (12-7) give
$$f(\theta) = \frac{a_0}{2} + \sum(a_nR^n\cos n\theta + b_nR^n\sin n\theta). \tag{12-10}$$
If $f(\theta)$ has a convergent Fourier series, the validity of (12-10) is ensured by choosing $a_nR^n$ and $b_nR^n$ to be the Fourier coefficients of $f$:
$$a_nR^n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\phi)\cos n\phi\,d\phi, \qquad b_nR^n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\phi)\sin n\phi\,d\phi. \tag{12-11}$$
The problem is now solved, but a simpler form can be found as follows: Substituting (12-11) into (12-9) gives
$$U(r,\theta) = \frac{1}{\pi}\int_{-\pi}^{\pi}\left[\frac{1}{2} + \sum_{n=1}^{\infty}\left(\frac{r}{R}\right)^n\cos n(\theta - \phi)\right]f(\phi)\,d\phi, \tag{12-12}$$
when we note that
$$\cos n\theta\cos n\phi + \sin n\theta\sin n\phi = \cos(n\theta - n\phi)$$
and interchange the order of summation and integration. The series in brackets in (12-12) can be summed as in Chap. 2, Sec. 17, Prob. 0. The result is the Poisson formula for a circle¹
$$U(r,\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{R^2 - r^2}{R^2 - 2rR\cos(\theta - \phi) + r^2}\,f(\phi)\,d\phi. \tag{12-13}$$
If $f(\phi)$ is piecewise continuous and bounded, one can differentiate under the integral sign for $r < R$ to find that (12-6) holds. Also, it can be shown that (12-13) gives
$$\lim_{r\to R-} U(r,\theta) = f(\theta) \tag{12-14}$$
provided $f$ is continuous at $\theta$. Hence (12-13) is a solution. In view of the derivation, it is remarkable that (12-14) holds even when the Fourier series for $f$ does not converge to $f$.
The expression (12-13) gives the steady-state temperature of a thin uniform insulated disk in terms of the temperature at the boundary. Or (12-13) can be interpreted as giving the temperature in a circular cylinder when the temperature of the surface is $f(\theta)$ independent of $z$. On the other hand the formula also gives the electrostatic potential in terms of its values on the boundary, and so on.
¹ Another derivation is given in Chap. 7, Sec. 21.
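The agreement between the Poisson integral (12-13) and the series form (12-9) can be checked numerically for simple boundary data. The sketch below is illustrative only: the boundary function $\cos 2\phi$, the values of $R$, $r$, $\theta$, and the function name are assumptions. For this boundary data the series (12-9) reduces to the single term $(r/R)^2\cos 2\theta$.

```python
import math

def poisson_disk(f, R, r, theta, samples=720):
    """Approximate (12-13) by the trapezoidal rule with equally spaced phi in [-pi, pi)."""
    total = 0.0
    for j in range(samples):
        phi = -math.pi + 2.0 * math.pi * j / samples
        kernel = (R * R - r * r) / (R * R - 2.0 * r * R * math.cos(theta - phi) + r * r)
        total += kernel * f(phi)
    # (1 / 2pi) * (2pi / samples) * sum = sum / samples
    return total / samples

R, r, theta = 2.0, 1.0, 0.7
boundary = lambda phi: math.cos(2.0 * phi)        # hypothetical boundary temperature
print(poisson_disk(boundary, R, r, theta))        # Poisson integral
print((r / R) ** 2 * math.cos(2.0 * theta))       # series (12-9) for this boundary data
```

The trapezoidal rule is used because the integrand is smooth and periodic in $\phi$, for which equally spaced samples converge very quickly when $r$ is not close to $R$.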
Example: Let $u(x,y)$ be harmonic in a plane region, and let $C$ be a circle contained entirely in the region. Show that the value of $u$ at the center of $C$ is the average of the values on the circumference.
Without loss of generality we can take the center to be at the origin. Equation (12-13) then gives, with $r = 0$,
$$u(0,0) = U(0,\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)\,d\phi. \tag{12-15}$$
Since $f(\phi)$ stands for the values of $u$ on the boundary, this is the required result.
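The mean-value property of the Example can be illustrated on a specific harmonic function. In this sketch the polynomial $x^2 - y^2 + 3x + 1$, the circle, and the function name are assumptions chosen for illustration; the circle is deliberately not centered at the origin.

```python
import math

def circle_average(u, cx, cy, radius, samples=64):
    """Average of u over equally spaced points on the circle of given center and radius."""
    total = 0.0
    for j in range(samples):
        phi = 2.0 * math.pi * j / samples
        total += u(cx + radius * math.cos(phi), cy + radius * math.sin(phi))
    return total / samples

harmonic = lambda x, y: x * x - y * y + 3.0 * x + 1.0   # u_xx + u_yy = 2 - 2 = 0
print(circle_average(harmonic, 1.0, 2.0, 0.5))   # compare with the center value
print(harmonic(1.0, 2.0))
```

A function that is not harmonic, such as $x^2 + y^2$, fails this test: its average over the circle exceeds the center value by the square of the radius.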
PROBLEMS
1. (a) Verify that $1/r$ in (12-2) satisfies the Laplace equation (12-1). Hint: $rr_x = x - x_1$. Using this, find $(1/r)_{xx}$. (b) Verify that $\log r$ satisfies the two-dimensional Laplace equation (12-5).
2. If $u(x,y) = U(r,\theta)$ with $x = r\cos\theta$, $y = r\sin\theta$, show that
$$u_{xx} + u_{yy} = r^{-1}(rU_r)_r + r^{-2}U_{\theta\theta}.$$
Hint: $U_r = u_x\cos\theta + u_y\sin\theta$, $U_\theta = u_x(-r\sin\theta) + u_y(r\cos\theta)$. Similarly, compute $(rU_r)_r$ and $(U_\theta)_\theta$.
3. Derive (12-10) by considering $U = R(r)\Theta(\theta)$ and separating variables.
4. Give two physical interpretations of the following Dirichlet problem for a semicircle, where $u(x,y) = U(r,\theta)$ as in (12-6):
$$u_{xx} + u_{yy} = 0, \qquad x^2 + y^2 < 1,\ y > 0,$$
$$U(1,\theta) = g(\theta), \qquad 0 < \theta < \pi,$$
$$U(r,0) = U(r,\pi) = 0, \qquad 0 < r < 1.$$
5. Solve Prob. 4 by the method of images. Hint: For $0 < \theta < \pi$, define $f(\theta) = g(\theta)$, $f(-\theta) = -g(\theta)$ and use (12-13).
6. Obtain a formula analogous to (12-13) for the region $r > R$. (Assume that $|U(r,\theta)|$ is bounded as $r \to \infty$ and, hence, that positive values of $n$ in the discussion of the text may be rejected.)
7. Interpret the result of Prob. 6 physically in terms of an infinite metal plate with a hole whose edges have a prescribed temperature.
13. Spherical Symmetry. Legendre Functions. Let it be required to determine the steady-state temperature in a uniform solid sphere of radius unity when one half of the surface is kept at the constant temperature 0°C and the other half at the constant temperature 1°C. By the discussion of Sec. 11, the temperature $u$ within the sphere satisfies Laplace's equation
$$u_{xx} + u_{yy} + u_{zz} = 0. \tag{13-1}$$
Symmetry suggests the use of spherical coordinates $(r,\theta,\phi)$ with origin at the center of the given unit sphere (Fig. 16). Since
$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta,$$
Laplace’s equation can be shown to be¹
$$r(rU)_{rr} + (U_\theta\sin\theta)_\theta\csc\theta + U_{\phi\phi}\csc^2\theta = 0, \tag{13-2}$$
where $u(x,y,z) = U(r,\theta,\phi)$. If the plane separating the unequally heated hemispheres is the $xy$ plane, the symmetry suggests that $U$ will be independent of $\phi$, so that (13-2) becomes
$$r(rU)_{rr} + (U_\theta\sin\theta)_\theta\csc\theta = 0. \tag{13-3}$$
The boundary conditions are
$$u = 1 \text{ for } 0 < \theta < \tfrac{1}{2}\pi, \qquad u = 0 \text{ for } \tfrac{1}{2}\pi < \theta < \pi, \qquad \text{when } r = 1. \tag{13-4}$$
We shall use the method of separating variables. Substituting the form
$$U = R(r)\Theta(\theta)$$
into (13-3) gives two ordinary differential equations,
$$r(rR)'' - aR = 0, \qquad (\Theta'\sin\theta)'\csc\theta + a\Theta = 0, \tag{13-5}$$
where $a$ is an arbitrary constant. The first of these equations can be solved by assuming that $R = r^m$ as in Chap. 1, Sec. 30. One obtains the linearly independent solutions
$$R = r^m, \qquad R = r^{-(m+1)},$$
where $m$ satisfies the quadratic equation
$$m(m + 1) = a. \tag{13-6}$$
Changing the independent variable in (13-5) from $\theta$ to $x$ by means of
$$x = \cos\theta, \qquad \Theta(\theta) = P(x),$$
and replacing $a$ by the expression (13-6), we get Legendre’s equation
$$(1 - x^2)P'' - 2xP' + m(m + 1)P = 0, \qquad ' = \frac{d}{dx}. \tag{13-7}$$
¹ See Chap. 6, Sec. 13, or proceed as in Prob. 2 of the preceding section.
When $m$ is a nonnegative integer, a solution of (13-7) is the Legendre polynomial $P_m(x) = P_m(\cos\theta)$. Thus, one is led to consider solutions of (13-3) which have the form
$$r^mP_m(\cos\theta) \quad\text{or}\quad r^{-(m+1)}P_m(\cos\theta).$$
The second of these expressions is rejected because it becomes infinite as $r \to 0$, and we attempt to build up the desired solution $u$ by forming a series
$$u = \sum_{m=0}^{\infty} A_mr^mP_m(\cos\theta). \tag{13-8}$$
Each term of this series satisfies (13-3).
When $r = 1$, Eq. (13-8) becomes
$$u = \sum_{m=0}^{\infty} A_mP_m(\cos\theta), \qquad r = 1, \tag{13-9}$$
and if it is possible to choose the constants $A_m$ in such a way that (13-9) satisfies the boundary condition (13-4), then (13-8) will be a solution of the problem. Since $x = \cos\theta$, the boundary condition requires
$$F(x) = \sum_{m=0}^{\infty} A_mP_m(x), \tag{13-10}$$
where $F(x) = 0$ for $-1 < x < 0$, and $F(x) = 1$ for $0 < x < 1$. Now, it was stated in Chap. 2, Sec. 22, that the expansion (13-10) is possible for suitably restricted functions $F(x)$ and that the coefficients are given by
$$A_m = \frac{2m + 1}{2}\int_{-1}^{1} F(x)P_m(x)\,dx.$$
By means of this formula, the solution is found to be
$$U = \tfrac{1}{2} + \tfrac{3}{4}rP_1(\cos\theta) - \tfrac{7}{16}r^3P_3(\cos\theta) + \tfrac{11}{32}r^5P_5(\cos\theta) - \cdots.$$
It is possible to establish that (13-8) is actually a solution, though the demonstration
requires a detailed knowledge of Legendre functions. 1 The uniqueness theorem es-
tablished in Sec. 24 shows that there is no other solution, and hence the foregoing
procedure can be justified. In particular it was permissible to take m as a nonnegative
integer and to use the polynomial solution of (13-7) rather than one of the infinite-series
solutions.
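The coefficients in the series for the heated sphere can be reproduced exactly with rational arithmetic. The sketch below is an illustration, not the text's own procedure: it builds the Legendre polynomials from Bonnet's recurrence $(n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}$ (a standard identity assumed here) and then applies the coefficient formula with $F(x) = 1$ on $(0,1)$ and $F(x) = 0$ on $(-1,0)$. The function names are hypothetical.

```python
from fractions import Fraction

def legendre_coeffs(m):
    """Coefficients [c0, c1, ...] of P_m(x) = sum c_k x^k, via Bonnet's recurrence."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]   # P_0 and P_1
    if m == 0:
        return prev
    for n in range(1, m):
        # (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
        nxt = [Fraction(0)] * (n + 2)
        for k, c in enumerate(cur):
            nxt[k + 1] += (2 * n + 1) * c
        for k, c in enumerate(prev):
            nxt[k] -= n * c
        nxt = [c / (n + 1) for c in nxt]
        prev, cur = cur, nxt
    return cur

def hemisphere_coefficient(m):
    """A_m = (2m + 1)/2 * integral_0^1 P_m(x) dx, since F vanishes on (-1, 0)."""
    integral = sum(c / (k + 1) for k, c in enumerate(legendre_coeffs(m)))
    return Fraction(2 * m + 1, 2) * integral

for m in range(6):
    print(m, hemisphere_coefficient(m))
```

The even-order coefficients beyond $A_0$ vanish, which is why only odd powers of $r$ appear after the constant term in the series.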
PROBLEMS FOR REVIEW
1. As an infinite series, express the steady-state temperature in a circular plate of radius $a$ which has one half of its circumference at 0°C and the other half at 100°C.
2. By (12-13), find the temperature of the plate considered in Prob. 1.
3. By separating variables in polar coordinates find the steady-state temperature in a semicircular plate of radius $a$ if the bounding diameter is kept at the temperature 0°C and the circumference is kept at the temperature 100°C.
4. Interpret the following Dirichlet problem physically, and solve:
$$u_{xx} + u_{yy} = 0, \qquad 0 < x < 1,\ 0 < y < 1,$$
$$u(0,y) = u(1,y) = u(x,0) = 0, \qquad u(x,1) = f(x).$$
5. Derive (13-5) from (13-3).
14. The Rectangular Membrane. Double Fourier Series. Let a uniform elastic membrane be stretched over a fixed, plane, bounding curve (Fig. 17). To explain what is meant by the tension, we consider the force $\Delta F$ exerted by the membrane on one side of a small straight slit of length $\Delta s$. The membrane is said to be under uniform tension $T$ if this force is directed perpendicular to the slit in the plane of the membrane and has magnitude $T\,\Delta s$ independent of the location and orientation of the slit.
A similar definition applies when the membrane does not lie in a plane except that we must let $\Delta s \to 0$:
$$T = \lim_{\Delta s\to 0}\frac{|\Delta F|}{\Delta s}.$$
The role taken by the plane of the membrane in the first case is now taken by the tangent plane at the point in question.
¹ One must show that the series obtained by differentiating (13-8) are uniformly convergent for $r < 1 - \delta$ and that the boundary condition is verified as $r \to 1$.
Let the coordinate system be so chosen that the bounding curve of the membrane lies in the $xy$ plane. The vertical displacement of any point in the membrane at time $t$ is denoted by $u = u(x,y,t)$. To obtain a differential equation for the motion, we consider a small, nearly square portion of the membrane bounded by vertical planes through the points
$$(x, y, 0), \quad (x + \Delta x, y, 0), \quad (x, y + \Delta y, 0), \quad (x + \Delta x, y + \Delta y, 0)$$
(see Fig. 18). Applying Newton's law to the small portion gives the approximate equation
$$u_{tt} = \gamma^2(u_{xx} + u_{yy}), \qquad \gamma^2 = \frac{T}{\rho}, \tag{14-1}$$
where $\rho$ is the surface density. This equation describes small oscillations of the freely vibrating membrane. Its derivation is similar to the corresponding derivation for a vibrating string (Sec. 2).
The problem of the vibrating membrane is solved when we have found the solution of (14-1) which satisfies appropriate initial and boundary conditions. We shall now consider the case of a clamped rectangular membrane with sides of lengths $a$ and $b$ (Fig. 19). The boundary conditions are
$$u = 0 \text{ for } x = 0 \text{ and for } x = a,\ 0 < y < b; \qquad u = 0 \text{ for } y = 0 \text{ and for } y = b,\ 0 < x < a. \tag{14-2}$$
To determine the solution uniquely we also specify the initial displacement and initial velocity:
$$u(x,y,0) = f(x,y), \qquad u_t(x,y,0) = g(x,y). \tag{14-3}$$
The assumption that
$$u = X(x)Y(y)T(t)$$
in (14-1) yields
$$\frac{T''}{T} = \gamma^2\left(\frac{X''}{X} + \frac{Y''}{Y}\right) \tag{14-4}$$
upon division by $XYT$. Since the variables are separated, the terms in (14-4) are constant. It can be shown that these constants are negative, so that we may write
$$\frac{X''}{X} = -p^2, \qquad \frac{Y''}{Y} = -q^2, \qquad \frac{T''}{T} = -\omega^2,$$
with $\gamma^2(p^2 + q^2) = \omega^2$ by (14-4).
Since $X'' + p^2X = 0$, the function $X(x)$ is a linear combination of $\sin px$ and $\cos px$. The cosine is rejected because the condition $u = 0$ at $x = 0$ gives $X(0) = 0$, and we must have $p = m\pi/a$, where $m$ is an integer, because the condition $u = 0$ at $x = a$ gives $X(a) = 0$. In just the same way it is found that
$$Y = \sin qy,$$
where $q = n\pi/b$ for an integer $n$. Thus, the desired oscillation has the form
$$\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}\,(A\cos\omega_{mn}t + B\sin\omega_{mn}t), \tag{14-5}$$
where $A$ and $B$ are constant and where $\omega_{mn} = \omega$ is given by
$$\omega_{mn} = \gamma\pi\sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}}.$$
The functions (14-5) satisfy the differential equation and the boundary condition. To satisfy the initial conditions (14-3) we try a superposition, using different constants $A$ and $B$ for each choice of $m$ and $n$:
$$u(x,y,t) = \sum_{m,n=1}^{\infty}(A_{mn}\cos\omega_{mn}t + B_{mn}\sin\omega_{mn}t)\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}. \tag{14-6}$$
Since the initial displacement is $f(x,y)$, we must determine $A_{mn}$ so that
$$f(x,y) = \sum_{m,n=1}^{\infty} A_{mn}\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}.$$
Multiplying this double Fourier series by $\sin(m\pi x/a)\sin(n\pi y/b)$ and integrating over the rectangle give the formula
$$A_{mn} = \frac{4}{ab}\int_0^b\!\!\int_0^a f(x,y)\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}\,dx\,dy$$
just as in the corresponding discussion for single Fourier series (Chap. 2, Sec. 18). Similarly, differentiating (14-6) with respect to $t$ and setting $t = 0$ give
$$B_{mn} = \frac{4}{ab\,\omega_{mn}}\int_0^b\!\!\int_0^a g(x,y)\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}\,dx\,dy$$
when we use the second initial condition (14-3).
The general term of the series (14-6) is a periodic function of time with period $2\pi/\omega_{mn}$. The corresponding frequencies
$$\frac{\omega_{mn}}{2\pi} = \frac{\gamma}{2}\sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}} \text{ cps} \tag{14-7}$$
are called characteristic frequencies, and the associated oscillations (14-5) are called modes. The fundamental mode is the mode of lowest frequency, obtained by setting $m = n = 1$.
Similar terminology applies to the vibrating string (Secs. 2-6). If the length of the string is $a$ and the equation of motion is
$$u_{tt} = \gamma^2u_{xx},$$
the characteristic frequencies may be written in the form
$$\frac{\omega_m}{2\pi} = \frac{\gamma m}{2a} \text{ cps} \tag{14-8}$$
analogous to (14-7). The modes are described by
$$u = \sin\frac{m\pi x}{a}\,(A\cos\omega_mt + B\sin\omega_mt)$$
and the fundamental is the mode obtained for $m = 1$. The three-dimensional analogue of (14-7) and (14-8) is discussed in Prob. 2.
In Sec. 8 it was shown for the vibrating string that the characteristic frequencies agree with the resonant frequencies, and a similar behavior is found for vibration phenomena in general. It is also true in general that the vibration can be expressed as a superposition of individual modes. This fact is illustrated by (14-6) and by the Fourier-series solution for the vibrating string.
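The characteristic frequencies of the rectangular membrane follow directly from the relation $\gamma^2(p^2 + q^2) = \omega^2$ with $p = m\pi/a$ and $q = n\pi/b$. The sketch below tabulates them; the function names and the sample values $a = b = 1$, $\gamma = 1$ are assumptions made for illustration.

```python
import math

def membrane_omega(m, n, a, b, gamma):
    """Angular frequency of the (m, n) mode: gamma*pi*sqrt((m/a)^2 + (n/b)^2)."""
    return gamma * math.pi * math.hypot(m / a, n / b)

def frequency_table(a, b, gamma, max_index):
    """Sorted list of (frequency in cps, m, n) for 1 <= m, n <= max_index."""
    return sorted((membrane_omega(m, n, a, b, gamma) / (2 * math.pi), m, n)
                  for m in range(1, max_index + 1)
                  for n in range(1, max_index + 1))

# Hypothetical square membrane: a = b = 1, gamma = 1.
for freq, m, n in frequency_table(1.0, 1.0, 1.0, 3):
    print(f"m={m} n={n} frequency={freq:.4f} cps")
```

For a square membrane the table shows pairs such as $(1,2)$ and $(2,1)$ sharing one frequency, so a single characteristic frequency can belong to more than one mode.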
The behavior of the vibrating membrane differs from that of the string in one respect. For each characteristic frequency of vibration of the string the corresponding mode is such that the string is divided into equal parts by the nodes whose positions are fixed. When a membrane oscillates with a given characteristic frequency, there are also points on the membrane which remain at rest. Such points form nodal lines. The position and the shape of the nodal lines, however, need not be the same for a given frequency.
As an illustration consider a rectangular membrane with $a = b$. The frequency equation (14-7) then yields
$$\omega_{mn} = \frac{\gamma\pi}{a}\sqrt{m^2 + n^2} = \alpha\sqrt{m^2 + n^2}, \qquad \alpha = \frac{\gamma\pi}{a}. \tag{14-9}$$
For $m = n = 1$, we get from (14-6) the fundamental mode
$$u_{11} = (A_{11}\cos\omega_{11}t + B_{11}\sin\omega_{11}t)\sin\frac{\pi x}{a}\sin\frac{\pi y}{a},$$
where $\omega_{11} = \alpha\sqrt{2}$. Since $u_{11} = 0$ for all $t$ only when $x = 0$, $y = 0$, $x = a$, $y = a$, there are no nodal lines in the interior of the membrane for this frequency. If we take $m = 1$, $n = 2$ and $m = 2$, $n = 1$, we get two modes:
$$u_{12} = (A_{12}\cos\omega_{12}t + B_{12}\sin\omega_{12}t)\sin\frac{\pi x}{a}\sin\frac{2\pi y}{a},$$
$$u_{21} = (A_{21}\cos\omega_{21}t + B_{21}\sin\omega_{21}t)\sin\frac{2\pi x}{a}\sin\frac{\pi y}{a}, \tag{14-10}$$
with the same frequency, since $\omega_{21} = \omega_{12} = \alpha\sqrt{5}$. For $y = a/2$, $u_{12} = 0$ and for $x = a/2$, $u_{21} = 0$. These nodal lines are shown in Fig. 20. By forming linear combinations of the modes in (14-10) we can get oscillations with the same frequency but with different nodal lines. Thus, if we take $A_{12} = A_{21} = 0$ and form $u_{12} + u_{21}$, we get
$$u_{12} + u_{21} = \sin\omega_{12}t\left(B_{12}\sin\frac{\pi x}{a}\sin\frac{2\pi y}{a} + B_{21}\sin\frac{2\pi x}{a}\sin\frac{\pi y}{a}\right)$$
$$= (\sin\omega_{12}t)\,2\sin\frac{\pi x}{a}\sin\frac{\pi y}{a}\left(B_{12}\cos\frac{\pi y}{a} + B_{21}\cos\frac{\pi x}{a}\right).$$
For this oscillation the nodal lines in the interior of the membrane are determined by
$$B_{12}\cos\frac{\pi y}{a} + B_{21}\cos\frac{\pi x}{a} = 0. \tag{14-11}$$
Equation (14-11) for $B_{12} = B_{21}$ yields the nodal line $x + y = a$ and for $B_{12} = -B_{21}$, the line $x - y = 0$ (see Fig. 20). Different nodal lines can be obtained by forming different linear combinations of the modes (14-10).
The reader will show that for $m = n = 2$, all oscillations have the same nodal lines $x = a/2$, $y = a/2$, while infinitely many different nodal lines can be obtained by forming different linear combinations of the modes $u_{13}$ and $u_{31}$. A few of these are shown in Fig. 20.
Since the nodal lines may be regarded as the boundaries of new membranes contained
in the original one, the character of oscillation of membranes of different shapes can be
deduced from the examination of nodal lines (see Prob. 3).
Nodal lines can be observed experimentally by sprinkling a fine powder on the vibrat-
ing membrane.
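The nodal lines of the combined $(1,2)$ and $(2,1)$ oscillation of a square membrane can be confirmed by direct evaluation. This is a small illustrative sketch: the side $a = 1$, the sample points, and the function name are assumptions, and only the spatial factor (with $A_{12} = A_{21} = 0$) is evaluated.

```python
import math

def combined_mode_shape(x, y, a, b12, b21):
    """Spatial factor of u_12 + u_21 when A_12 = A_21 = 0."""
    return (b12 * math.sin(math.pi * x / a) * math.sin(2 * math.pi * y / a)
            + b21 * math.sin(2 * math.pi * x / a) * math.sin(math.pi * y / a))

a = 1.0
samples = [0.1 * k for k in range(1, 10)]     # interior points 0.1, ..., 0.9
# B_12 = B_21: the shape should vanish along x + y = a.
print(max(abs(combined_mode_shape(x, a - x, a, 1.0, 1.0)) for x in samples))
# B_12 = -B_21: the shape should vanish along x = y.
print(max(abs(combined_mode_shape(x, x, a, 1.0, -1.0)) for x in samples))
# Off the nodal line the shape is generally nonzero.
print(combined_mode_shape(0.25, 0.25, a, 1.0, 1.0))
```

The first two printed values are zero up to rounding, while the third is not, so the diagonal is a nodal line only for the matching choice of coefficients.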
PROBLEMS
1. Suppose the initial conditions for the rectangular membrane considered in the text are
$$u(x,y,0) = 0.1\sin\frac{\pi x}{a}\sin\frac{\pi y}{b}, \qquad u_t(x,y,0) = 0.$$
(a) What is the frequency of the oscillation? (b) What is the maximum speed attained by the mid-point of the membrane?
2. Analysis of a microwave resonant cavity leads to the equation
$$u_{tt} = \gamma^2(u_{xx} + u_{yy} + u_{zz})$$
with the boundary condition $u = 0$ or $\partial u/\partial n = 0$ on suitable portions of the planes
$$x = 0, \quad x = a, \quad y = 0, \quad y = b, \quad z = 0, \quad z = c$$
(see Fig. 21). By assuming $u = XYZT$ show that the characteristic frequencies are
$$\omega_{mnp} = \gamma\pi\left[\left(\frac{m}{a}\right)^2 + \left(\frac{n}{b}\right)^2 + \left(\frac{p}{c}\right)^2\right]^{1/2},$$
where $m$, $n$, and $p$ are integers.
3. A curve in the $xy$ plane along which $u = 0$ for all $t$ is called a nodal line. (a) Sketch the nodal lines for the oscillation (14-5). (b) Sketch the nodal line for the oscillation
$$\left(\sin\frac{\pi x}{a}\sin\frac{2\pi y}{b} + \sin\frac{2\pi x}{a}\sin\frac{\pi y}{b}\right)(A\cos\omega_{21}t + B\sin\omega_{21}t)$$
which arises by adding the modes $m = 1$, $n = 2$ and $m = 2$, $n = 1$. Hint: $\sin 2\theta = 2\sin\theta\cos\theta$. (c) Thus obtain one solution for the problem of a triangular membrane.
16 . the Circular Membrane, Bessel Functions. To discuss the oscil-
lations of a circular membrane with fixed edges we introduce cylindrical
coordinates ( r,$,u ). With
u(z,y,t) « U(rfi t t) (15-1)
the equation of motion (14-1) takes the form (cf. Sec. 12, Prob. 2)
V U - y 2 [U r r + T~~ l U r + T~~ 2 Um], (15-2)
If the boundary is the circle $r = a$, then the boundary condition is
$$U(a,\theta,t) = 0. \tag{15-3}$$
To make the problem definite we also introduce initial conditions
$$U(r,\theta,0) = f(r), \qquad U_t(r,\theta,0) = 0 \tag{15-4}$$
which state, respectively, that the initial shape of the membrane is given
by f(r) and that the initial velocity is zero.
Since the initial shape is independent of $\theta$, the solution presumably
involves r and t only. Thus, we consider expressions of the form $R(r)T(t)$
when applying the separation method. Substituting into (15-2) gives
$$\frac{1}{T}\frac{d^2T}{dt^2} = \gamma^2\left(\frac{1}{R}\frac{d^2R}{dr^2} + \frac{1}{rR}\frac{dR}{dr}\right) \tag{15-5}$$
after division by RT. Since the left-hand member of (15-5) depends on t
alone and the right-hand member on r alone, each side must be constant.
It can be shown that the constant is not positive, hence may be written
as $-\omega^2$. Thus (15-5) leads to
$$T'' + \omega^2 T = 0, \qquad {}' = \frac{d}{dt}, \tag{15-6}$$
$$R'' + r^{-1}R' + k^2R = 0, \qquad {}' = \frac{d}{dr}, \tag{15-7}$$
where $k = \omega/\gamma$.
Equation (15-6) is the familiar equation for simple harmonic motion,
and Eq. (15-7) can be reduced to the Bessel equation by the substitution
$x = kr$. Hence, (15-7) has a solution
$$R = J_0(x) = J_0(kr).$$
The other solutions of (15-7) are rejected because they become infinite
at $r = 0$, and we are led to the functions
$$J_0(kr)\sin\omega t \quad\text{or}\quad J_0(kr)\cos\omega t.$$
Since $u_t = 0$ when $t = 0$, we reject the solution involving the sine. The
boundary condition (15-3) applied to our elementary solution RT now gives
$$J_0(ka)\cos\omega t = 0$$
for all t. This requires that ka be a root of the equation $J_0(x) = 0$ (see
Fig. 22). If the positive roots of $J_0(x)$ are denoted by $x_n$, the appropriate
choices of k are given by $k_n = x_n/a$. Since $\omega = k\gamma$, our elementary
solutions have the form
$$RT = J_0(k_n r)\cos k_n\gamma t.$$
These functions satisfy the differential equation (15-2), the boundary
condition (15-3), and the second initial condition (15-4). To satisfy the
first initial condition we try to represent U as a linear combination of such
terms:
$$U = \sum_{n=1}^{\infty} A_n J_0(k_n r)\cos k_n\gamma t. \tag{15-8}$$
When $t = 0$, the initial condition requires that
$$f(r) = \sum_{n=1}^{\infty} A_n J_0(k_n r).$$
The problem of expanding an arbitrary function in series of Bessel functions
was discussed in Chap. 2, Sec. 22. It was shown that the coefficients are
given by
$$A_n = \frac{2}{a^2[J_1(k_n a)]^2}\int_0^a r f(r) J_0(k_n r)\,dr \tag{15-9}$$
provided the series is uniformly convergent (but see also Chap. 2, Sec. 23).
In the terminology of the preceding section, the solution (15-8) is
expressed by means of the modes. The characteristic frequencies are
$$\frac{\omega_n}{2\pi} = \frac{k_n\gamma}{2\pi} = \frac{x_n\gamma}{2\pi a}$$
and the fundamental is described by $J_0(k_1 r)\cos k_1\gamma t$.
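The characteristic frequencies can be estimated numerically once the roots $x_n$ of $J_0$ are known. The following is a minimal sketch, not part of the original text: it assumes the standard integral representation $J_0(x) = \pi^{-1}\int_0^\pi \cos(x\sin\theta)\,d\theta$, and the function names `bessel_j0`, `bisect`, `j0_zeros`, and `characteristic_frequencies` are illustrative choices.

```python
import math

def bessel_j0(x, n=200):
    """Approximate J0(x) by the midpoint rule applied to
    J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) d(theta)."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(n)) / n

def bisect(func, lo, hi, tol=1e-12):
    """Root of func in [lo, hi], assuming func changes sign on the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid > 0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def j0_zeros(count, step=0.1):
    """First `count` positive roots x_n of J0, found by a sign scan plus bisection."""
    zeros = []
    x = step
    f_x = bessel_j0(x)
    while len(zeros) < count:
        f_next = bessel_j0(x + step)
        if f_x * f_next < 0:
            zeros.append(bisect(bessel_j0, x, x + step))
        x, f_x = x + step, f_next
    return zeros

def characteristic_frequencies(a, gamma, count):
    """Frequencies omega_n / (2 pi) = x_n * gamma / (2 pi a) for a membrane of radius a."""
    return [x_n * gamma / (2.0 * math.pi * a) for x_n in j0_zeros(count)]
```

For a membrane of radius `a` and wave constant `gamma`, `characteristic_frequencies(a, gamma, 3)` returns the fundamental and the next two radially symmetric overtones; the first root found is close to 2.4048, so the overtones are not integer multiples of the fundamental.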
PROBLEMS
1. The oscillations of a cylindrical resonant cavity satisfy
$$u_{tt} = \gamma^2(u_{xx} + u_{yy} + u_{zz}), \qquad 0 < r < a, \quad 0 < z < b$$
with boundary condition $u = 0$ on the curved surface, $u_z = 0$ on the plane ends. Obtain
solutions of the form $R(r)Z(z)T(t)$ for this problem.
2. Find the distribution of temperature in a long cylinder whose surface is kept at
the constant temperature zero and whose initial temperature in the interior is unity.
3. An elastic membrane subject to uniform gas pressure satisfies the equation
$$u_{tt} + p = \gamma^2(u_{xx} + u_{yy}),$$
where p is a constant depending on the pressure. If the membrane is circular, show
how to reduce this problem to a problem of the type solved in the text. Hint: Consider
the function
$U(r,\theta,t)$
SOLUTION BY INTEGRALS
16. The Fourier Transform. For many partial differential equations
the desired solution can be expressed as an integral involving the initial
or boundary values. This possibility was already illustrated by formula
(3-10) for displacement of a vibrating string and by the solution of the
Dirichlet problem given in (12-13). We shall now describe a systematic
method of obtaining integral formulas.
The function $g(s)$ defined by
$$\mathbf{T}f \equiv \lim_{a\to\infty}\frac{1}{\sqrt{2\pi}}\int_{-a}^{a} e^{-ixs}f(x)\,dx = g(s) \tag{16-1}$$
is called the Fourier transform of $f(x)$; the operator $\mathbf{T}$ is called the Fourier
transform operator. The inverse operator $\mathbf{T}^{-1}$ is obtained by changing
the sign of i, so that the foregoing equation may also be written
$$\mathbf{T}^{-1}g \equiv \lim_{a\to\infty}\frac{1}{\sqrt{2\pi}}\int_{-a}^{a} e^{ixs}g(s)\,ds = f(x). \tag{16-2}$$
When such is the case, the symbol $\mathbf{T}$ satisfies the easily remembered equations
$$\mathbf{T}\mathbf{T}^{-1}f = f, \qquad \mathbf{T}^{-1}\mathbf{T}f = f. \tag{16-3}$$
If the limits in (16-1) and (16-2) are regarded in the sense of mean convergence (Chap.
2, Sec. 23), and if the integrals are regarded as Lebesgue integrals (Appendix C), then
(16-1) gives (16-2) and (16-2) gives (16-1) provided either of the integrals
$$\int_{-\infty}^{\infty}|f(x)|^2\,dx, \qquad \int_{-\infty}^{\infty}|g(s)|^2\,ds \tag{16-4}$$
is finite.$^1$ Both integrals (16-4) then have the same value. In many physical problems
the common value represents the total power or energy present in the system.
To illustrate the use of the Fourier transform, we shall solve the problem
$$u_t = a^2u_{xx}, \qquad t > 0, \quad -\infty < x < \infty, \tag{16-5}$$
$$u(x,0) = f(x), \qquad -\infty < x < \infty, \tag{16-6}$$
$$u(x,t) \to 0, \qquad \text{as } t \to \infty. \tag{16-7}$$
Physically, this system describes the temperature $u(x,t)$ of an infinitely
long bar at point x and time t when the initial temperature $u(x,0)$ is known.
The trial solution $u = e^{px+qt}$ with p and q constant leads to
$$qe^{px+qt} = a^2p^2e^{px+qt}$$
when substituted into (16-5). Hence $q = a^2p^2$, and the trial solution is
$$e^{px+a^2p^2t}.$$
We choose $p^2$ negative because of (16-7). Thus $p = is$, where s is real,
and the trial solution is now
$$e^{isx-a^2s^2t} = e^{isx}e^{-a^2s^2t}. \tag{16-8}$$
We shall satisfy the initial condition (16-6) by forming a linear
combination$^2$ of solutions (16-8). Thus
$$\frac{1}{\sqrt{2\pi}}\,e^{isx}e^{-a^2s^2t}g(s)$$
is a solution of (16-5) no matter what value $g(s)$ may have, and the integral
$$u(x,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{isx}e^{-s^2a^2t}g(s)\,ds$$
is also a solution, provided we can differentiate under the integral sign.
By (16-2) the latter expression can be written
$$u(x,t) = \mathbf{T}^{-1}e^{-a^2s^2t}g(s). \tag{16-9}$$
Setting $t = 0$ and using the initial condition (16-6) give
$$f(x) = \mathbf{T}^{-1}g(s),$$
1 This important theorem, known as Plancherel's theorem, is proved in E. C. Titchmarsh,
"Introduction to the Theory of Fourier Integrals," Chap. 3, Oxford University
Press, London, 1937. For a heuristic discussion of the relation between (16-1) and
(16-2) see Chap. 2, Secs. 20 and 21, of the present text.
2 This procedure is analogous to the formation of Fourier series in the method of
separating variables.
so that $\mathbf{T}f = \mathbf{T}\mathbf{T}^{-1}g = g$. Substituting into (16-9) we get the final answer,
$$u(x,t) = \mathbf{T}^{-1}e^{-a^2s^2t}\,\mathbf{T}f. \tag{16-10}$$
This is an explicit formula for the temperature $u(x,t)$ in terms of the
initial temperature $f(x)$.
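Formula (16-10) can be checked numerically in a case where the transform is known in closed form. The sketch below is an illustration under stated assumptions, not the text's own procedure: it takes $f(x) = e^{-x^2/2}$, whose transform is $e^{-s^2/2}$, evaluates the inverse transform by the trapezoid rule, and compares with the closed form $(1+2a^2t)^{-1/2}e^{-x^2/[2(1+2a^2t)]}$ obtained by carrying out the Gaussian integral. The function names and the truncation parameters `s_max` and `n` are hypothetical choices.

```python
import math

def heat_solution_gaussian(x, t, a=1.0, s_max=12.0, n=4000):
    """Evaluate u = T^{-1}[exp(-a^2 s^2 t) * Tf] for f(x) = exp(-x^2/2),
    using Tf(s) = exp(-s^2/2) and the trapezoid rule on [-s_max, s_max]."""
    h = 2.0 * s_max / n
    total = 0.0
    for i in range(n + 1):
        s = -s_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        # The sine part of exp(isx) integrates to zero because the rest is even in s.
        total += weight * math.cos(s * x) * math.exp(-a * a * s * s * t) * math.exp(-0.5 * s * s)
    return total * h / math.sqrt(2.0 * math.pi)

def heat_solution_closed_form(x, t, a=1.0):
    """Closed form of the same integral, a spreading Gaussian."""
    v = 1.0 + 2.0 * a * a * t
    return math.exp(-x * x / (2.0 * v)) / math.sqrt(v)
```

At $t = 0$ both functions reduce to the initial temperature $e^{-x^2/2}$, and for $t > 0$ the profile flattens while its integral over x stays constant, as expected for heat conduction in an infinite bar.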
As another example we shall solve the Dirichlet problem for a half
plane. Several physical interpretations were given in Sec. 12; the
mathematical formulation is
$$u_{xx} + u_{yy} = 0, \qquad y > 0, \quad -\infty < x < \infty, \tag{16-11}$$
$$u(x,0) = f(x), \qquad -\infty < x < \infty, \tag{16-12}$$
$$u(x,y) \to 0, \qquad \text{as } y \to \infty. \tag{16-13}$$
The function $e^{px+qy}$ satisfies (16-11) if $p^2 + q^2 = 0$. We choose q real
and negative because of (16-13), and hence p is pure imaginary, $p = is$.
The trial solution is now
$$e^{isx}e^{-|s|y}$$
when we note that $q^2 = s^2$ and that q is negative. This function satisfies
(16-11) and (16-13). To satisfy (16-12) we form a linear combination as
in the previous example, thus:
$$u(x,y) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{isx}e^{-|s|y}g(s)\,ds = \mathbf{T}^{-1}e^{-|s|y}g.$$
Setting $y = 0$ we get $f = \mathbf{T}^{-1}g$ by (16-12). Hence $g = \mathbf{T}f$, and the
solution$^1$ is
$$u(x,y) = \mathbf{T}^{-1}e^{-|s|y}\,\mathbf{T}f. \tag{16-14}$$
As a final example we shall consider the problem
$$u_{tt} = a^2u_{xx}, \qquad u(x,0) = f(x), \qquad u_t(x,0) = g(x)$$
which describes waves on an infinite string (cf. Secs. 2 and 3). The trial
solution $e^{isx+qt}$ yields the two expressions
$$e^{isx}e^{iast} \quad\text{and}\quad e^{isx}e^{-iast}.$$
Forming a general linear combination, we get
$$u(x,t) = \mathbf{T}^{-1}e^{iast}g_1(s) + \mathbf{T}^{-1}e^{-iast}g_2(s) \tag{16-15}$$
where $g_1$ and $g_2$ are to be determined from the initial conditions.
If (16-15) can be differentiated under the integrals which are implied
in $\mathbf{T}^{-1}$, the result is
$$u_t(x,t) = \mathbf{T}^{-1}ias\,e^{iast}g_1(s) - \mathbf{T}^{-1}ias\,e^{-iast}g_2(s). \tag{16-16}$$
1 Formulas of the type (16-10) and (16-14) are discussed in R. M. Redheffer, Operators
and Initial-value Problems, Proc. Am. Math. Soc., 4 (August, 1953).
Setting $t = 0$ in (16-15) and (16-16) now gives, respectively,
$$f = \mathbf{T}^{-1}g_1 + \mathbf{T}^{-1}g_2, \qquad g = \mathbf{T}^{-1}ias\,g_1 - \mathbf{T}^{-1}ias\,g_2$$
or, after operating on the equations with $\mathbf{T}$,
$$g_1 + g_2 = \mathbf{T}f, \qquad ias(g_1 - g_2) = \mathbf{T}g.$$
Solving for $g_1$ and $g_2$,
$$g_1 = \frac{1}{2}\mathbf{T}f + \frac{1}{2ias}\mathbf{T}g, \qquad g_2 = \frac{1}{2}\mathbf{T}f - \frac{1}{2ias}\mathbf{T}g, \tag{16-17}$$
and this gives the final answer upon substitution into (16-15).
The foregoing result can be deduced from d'Alembert's formula (3-15).
However, the method of Fourier transforms also applies when d'Alembert's
method fails (cf. Probs. 1 and 2).
PROBLEMS
1. According to Sec. 7 the equation for damped motion of waves on a string is
$$u_{tt} - a^2u_{xx} = -2bu_t.$$
Obtain a family of solutions of the type
$$u(x,t) = \mathbf{T}^{-1}e^{-bt}e^{t\sqrt{b^2-a^2s^2}}g_1(s) + \mathbf{T}^{-1}e^{-bt}e^{-t\sqrt{b^2-a^2s^2}}g_2(s)$$
by starting with $u = e^{isx+qt}$ and forming a linear combination.
2. Formulate appropriate initial conditions for Prob. 1, and use them to determine
$g_1$ and $g_2$.
3. The displacement $u(x,t)$ of a long, stiff rod satisfies
$$EI\,\frac{\partial^4u}{\partial x^4} = f(x,t) \equiv \text{force},$$
when the mass is negligible (cf. Sec. 2, Prob. 6). Let $U(s,t)$ be the transform of u with
respect to the variable x, and let $F(s,t)$ be the transform of f. Neglecting convergence
questions, show that $EIs^4U = F$, and thus obtain the solution in the form
$$u(x,t) = (EI)^{-1}\,\mathbf{T}^{-1}s^{-4}\,\mathbf{T}f.$$
Hint: Write out the expressions
$$u(x,t) = \mathbf{T}^{-1}U(s,t), \qquad f(x,t) = \mathbf{T}^{-1}F(s,t)$$
in full, and substitute into the differential equation.
4. If the mass of the rod in the preceding example is m, the equation of motion is
$$EI\,\frac{\partial^4u}{\partial x^4} + m\,\frac{\partial^2u}{\partial t^2} = f(x,t).$$
Show that $u = \mathbf{T}^{-1}U$, where $U = U(s,t)$ satisfies the ordinary differential equation
$$EIs^4U + mU_{tt} = F(s,t).$$
17. Waves in a Half Plane. The Fourier transform can be used to solve
the two-dimensional wave equation
$$\gamma^2(u_{xx} + u_{yy}) = u_{tt}, \qquad \gamma = \text{const}, \tag{17-1}$$
and the result has an interesting physical interpretation. We suppose
that the time dependence is harmonic, so that
$$u(x,y,t) = U(x,y)e^{-i\omega t}, \tag{17-2}$$
where $\omega$ is constant. Substituting in (17-1) gives the scalar wave equation
$$U_{xx} + U_{yy} + k^2U = 0, \qquad k = \frac{\omega}{\gamma}. \tag{17-3}$$
This equation will now be solved in the half plane $y > 0$ subject to the
additional conditions
$$U(x,0) = f(x), \qquad U(x,y) \to 0 \text{ as } y \to \infty. \tag{17-4}$$
Physically, the solution describes the radiation field of an antenna$^1$ when
the aperture illumination is $f(x)$ (see Fig. 23).
By substituting the function $e^{ixs+qy}$ into (17-3) we obtain solutions
$$e^{ixs}e^{\pm iy\sqrt{k^2-s^2}}. \tag{17-5}$$
Because of the second condition (17-4) the coefficient of y in (17-5) has a
negative real part when s is large; we shall indicate this by dropping the
minus sign. Forming a linear combination of expressions (17-5) as in the
preceding section,
$$U(x,y) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ixs}e^{iy\sqrt{k^2-s^2}}g(s)\,ds = \mathbf{T}^{-1}e^{iy\sqrt{k^2-s^2}}g(s).$$
For $y = 0$ the first condition (17-4) yields $g = \mathbf{T}f$, and hence
$$U(x,y) = \mathbf{T}^{-1}e^{iy\sqrt{k^2-s^2}}\,\mathbf{T}f. \tag{17-6}$$
Multiplying by $e^{-i\omega t}$ as in (17-2), we get the corresponding solution of
(17-1).
1 Formulation of (17-3) in this context and discussion of the conditions for a unique
solution can be found in treatises on electromagnetic theory.
The discussion of (17-1) given here contrasts with that given in connection with the
vibrating membrane (Sec. 14). For the membrane we specified the initial values of
u and $u_t$, and we obtained a series involving infinitely many oscillation frequencies.
Here, on the contrary, the frequency $\omega/2\pi$ was prescribed in advance. By (17-2) the
initial conditions are
$$u(x,y,0) = U(x,y), \qquad u_t(x,y,0) = -i\omega U(x,y)$$
and the first condition cannot be specified arbitrarily, inasmuch as $U(x,y)$ satisfies (17-3).
To interpret the solution physically, we have
$$u(x,y,t) = \mathbf{T}^{-1}e^{-i\omega t+iy\sqrt{k^2-s^2}}g(s)$$
by combining (17-2) and (17-6), with $\mathbf{T}f = g(s)$. Writing out in full,
$$u(x,y,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{i(xs+y\sqrt{k^2-s^2}-\omega t)}g(s)\,ds. \tag{17-7}$$
For simplicity we shall suppose that $g(s) = 0$ when $|s| > k$. The limits
$(-\infty,\infty)$ of the integral can then be replaced by $(-k,k)$. If we now
introduce a variable $\theta$,
$$s = k\sin\theta, \qquad \sqrt{k^2-s^2} = k\cos\theta, \tag{17-8}$$
we get
$$u(x,y,t) = \frac{1}{\sqrt{2\pi}}\int_{-\pi/2}^{\pi/2} e^{i(kx\sin\theta+ky\cos\theta-\omega t)}g(k\sin\theta)\,k\cos\theta\,d\theta.$$
This formula expresses the solution as a superposition of functions of the
form
$$e^{i(kx\sin\theta+ky\cos\theta-\omega t)}. \tag{17-9}$$
In the next paragraph it will be seen that (17-9) represents a plane wave
traveling with velocity $\gamma$ in the direction $\theta$ (Fig. 23). Hence, the Fourier
transform procedure gives the plane-wave expansion of the antenna field.
The amplitude of the wave moving in a given direction $\theta$ is
$$g(k\sin\theta)\,k\cos\theta\,d\theta,$$
where $g(s)$ is the Fourier transform of the aperture illumination.
To see that (17-9) represents a plane wave we examine the (x,y) locus on which (17-9)
is constant. Without essential loss of generality we confine our attention, in particular,
to the locus along which the exponent is zero. The equation of this locus is
$$x\sin\theta + y\cos\theta = \frac{\omega}{k}\,t = \gamma t, \tag{17-10}$$
when we divide by k and replace k by its value (17-3).
The wave front (17-10) is clearly a straight line, and the wave fronts for different
t's are all parallel. In fact, their common perpendicular makes an angle $\theta$ with the y
axis. Since the distance from the line (17-10) to the origin is $\gamma t$, the velocity of
propagation is the constant $\gamma$, and hence the desired result is established. It is possible to give
a similar discussion for the part of the integral (17-7) with $|s| > k$, though we shall not
do so here.
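18. The Convolution Theorem. The convolution $f * g$ of two functions
f and g is defined by
$$f * g = \lim_{a\to\infty}\frac{1}{\sqrt{2\pi}}\int_{-a}^{a} f(\xi)g(x-\xi)\,d\xi = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)g(x-\xi)\,d\xi \tag{18-1}$$
if the limit exists either in the ordinary sense or in the sense of mean
convergence. The importance of the operation (18-1) rests on the following
theorem:
Convolution Theorem. Let $f$, $g$, $|f|^2$ and $|g|^2$ be integrable, and let
all infinite integrals be interpreted in the sense of mean convergence. Then
the product of the transforms equals the transform of the convolution. In
symbols,
$$\mathbf{T}(f * g) = (\mathbf{T}f)(\mathbf{T}g). \tag{18-2}$$
Although a complete proof requires knowledge of Lebesgue integration and mean
convergence, the result can be made plausible as follows. We have
$$\mathbf{T}(f * g) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-isx}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)g(x-\xi)\,d\xi\right]dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-isx}g(x-\xi)\,dx\right]d\xi$$
provided the order of integration may be inverted. If the variable x in the inner integral
is changed to $t = x - \xi$, we get
$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-is(t+\xi)}g(t)\,dt\right]d\xi = \left(\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\xi)e^{-is\xi}\,d\xi\right)\left(\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(t)e^{-ist}\,dt\right)$$
and this is $(\mathbf{T}f)(\mathbf{T}g)$.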
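The theorem is easy to test numerically for functions that decay rapidly. The following sketch is an illustration only: it assumes the real, even test functions $f = e^{-x^2/2}$ and $g = e^{-x^2}$ (so that only the cosine part of $e^{-isx}$ contributes), uses the trapezoid rule on a truncated interval, and keeps the factor $1/\sqrt{2\pi}$ in both the transform (16-1) and the convolution (18-1). The helper names `transform` and `convolve` are hypothetical.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def trapezoid(func, half_width=10.0, n=400):
    """Trapezoid rule for the integral of func over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.5 * (func(-half_width) + func(half_width))
    for i in range(1, n):
        total += func(-half_width + i * h)
    return total * h

def transform(func, s):
    """Tf at s for a real, even func: the sine part of exp(-isx) drops out."""
    return trapezoid(lambda x: math.cos(s * x) * func(x)) / SQRT_2PI

def convolve(f, g, x):
    """(f * g)(x) with the 1/sqrt(2 pi) normalization of (18-1)."""
    return trapezoid(lambda xi: f(xi) * g(x - xi)) / SQRT_2PI

def f(x):
    return math.exp(-0.5 * x * x)

def g(x):
    return math.exp(-x * x)

s = 0.8
lhs = transform(lambda x: convolve(f, g, x), s)   # T(f * g)
rhs = transform(f, s) * transform(g, s)           # (Tf)(Tg)
closed_form = math.exp(-0.75 * s * s) / math.sqrt(2.0)
```

Here `lhs` and `rhs` agree to within the quadrature error, and both match `closed_form`, which follows from the Gaussian transform pair quoted in Prob. 1 of this section.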
By means of the convolution theorem some of the foregoing results
can be greatly simplified. Taking the transform of the formula (16-10)
with respect to x gives
$$\mathbf{T}u = e^{-a^2s^2t}\,\mathbf{T}f \tag{18-3}$$
for the temperature $u(x,t)$ of a rod when the initial temperature is $f(x)$.
By consulting a table of Fourier transforms or by using the result of Prob. 1,
$$e^{-a^2s^2t} = \mathbf{T}g(x), \qquad\text{where } g(x) = (2a^2t)^{-1/2}e^{-x^2/(4a^2t)}.$$
Hence, the result (18-3) may be written
$$\mathbf{T}u = (\mathbf{T}f)(\mathbf{T}g) = \mathbf{T}(f * g)$$
when we recall (18-2). Taking the inverse transform now yields
$$u(x,t) = f * g = (4\pi a^2t)^{-1/2}\int_{-\infty}^{\infty} f(\xi)e^{-(x-\xi)^2/(4a^2t)}\,d\xi. \tag{18-4}$$
The advantage of this formula is that it involves only a single integration
whereas (16-10) requires two integrations. Since the integral is rapidly
convergent when t is not too large, (18-4) is well suited for numerical
computation.
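As an illustration of the numerical use of (18-4), the sketch below evaluates the integral by Simpson's rule for an initial temperature that vanishes outside a finite interval. The test case, a unit pulse on $-1 < x < 1$, and its closed form in terms of the error function are assumptions introduced here for checking; they are not taken from the text. The function names are hypothetical.

```python
import math

def heat_kernel_solution(f, x, t, a, lo, hi, n=400):
    """Simpson-rule estimate of (18-4), assuming f vanishes outside [lo, hi].
    n must be even."""
    h = (hi - lo) / n
    scale = 4.0 * a * a * t

    def integrand(xi):
        return f(xi) * math.exp(-(x - xi) ** 2 / scale)

    total = integrand(lo) + integrand(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    # (4 pi a^2 t)^(-1/2) is the factor in front of the integral in (18-4).
    return total * h / 3.0 / math.sqrt(math.pi * scale)

def unit_pulse_exact(x, t, a):
    """Closed form for f = 1 on (-1, 1), f = 0 elsewhere, via the error function."""
    d = 2.0 * a * math.sqrt(t)
    return 0.5 * (math.erf((1.0 - x) / d) + math.erf((1.0 + x) / d))
```

For example, `heat_kernel_solution(lambda xi: 1.0, 0.4, 0.1, 1.0, -1.0, 1.0)` agrees closely with `unit_pulse_exact(0.4, 0.1, 1.0)`; a single one-dimensional quadrature suffices, in line with the remark that (18-4) needs only one integration.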
To obtain a physical interpretation of (18-4), let the rod have the initial
temperature zero except for a short piece on the interval $(x_0 - \epsilon, x_0 + \epsilon)$
(see Fig. 24). If Q cal of heat is uniformly distributed over this element
of the rod, the corresponding initial temperature f is given by
$$Q = 2\epsilon c\rho f, \qquad x_0 - \epsilon < x < x_0 + \epsilon,$$
FIG. 24
where c is the heat capacity and $\rho$ the linear density. By (18-4) the resulting
temperature at point x and time t is
$$u(x,t) = \frac{Q}{c\rho(4\pi a^2t)^{1/2}}\,\frac{1}{2\epsilon}\int_{x_0-\epsilon}^{x_0+\epsilon} e^{-(x-\xi)^2/(4a^2t)}\,d\xi.$$
Letting $\epsilon \to 0$ and using the mean-value theorem we get
$$\frac{Q}{c\rho(4\pi a^2t)^{1/2}}\,e^{-(x-x_0)^2/(4a^2t)}. \tag{18-5}$$
This gives the temperature distribution for an instantaneous source of
strength Q at the point $x_0$. Now, Eq. (18-4) represents the temperature in
the general case as a superposition of such sources. The source at $x = \xi$
has the strength
$$Q = c\rho f(\xi)\,d\xi.$$
As we shall see in the next section, this physical interpretation enables us
to solve a variety of problems in heat flow with the greatest ease.
Example: Let $u(x,y)$ be harmonic for $y > 0$ and satisfy the additional conditions
$$u(x,0) = f(x), \qquad u(x,y) \to 0 \text{ as } y \to \infty.$$
Show that u is given by the Poisson formula for a half plane:
$$u(x,y) = \frac{y}{\pi}\int_{-\infty}^{\infty}\frac{f(\xi)}{(x-\xi)^2+y^2}\,d\xi. \tag{18-6}$$
Since this problem is the same as that in (16-11) to (16-13), the solution is given by
(16-14). Taking the transform of (16-14),
$$\mathbf{T}u = e^{-|s|y}\,\mathbf{T}f. \tag{18-7}$$
The convolution theorem can be applied if we express $e^{-|s|y}$ as a Fourier transform.
To this end we compute the inverse transform
$$\mathbf{T}^{-1}e^{-|s|y} = \frac{1}{\sqrt{2\pi}}\int_0^{\infty} e^{isx}e^{-sy}\,ds + \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0} e^{isx}e^{sy}\,ds = \frac{1}{\sqrt{2\pi}}\left(\frac{1}{y-ix} + \frac{1}{y+ix}\right). \tag{18-8}$$
This shows that $e^{-|s|y} = \mathbf{T}g$, where g is the function (18-8):
$$g = \frac{1}{\sqrt{2\pi}}\,\frac{2y}{x^2+y^2}.$$
The convolution theorem applied to (18-7) now gives $u = f * g$, and that is the desired
result.
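The Poisson formula (18-6) lends itself to a direct numerical check. The sketch below is an illustration under assumptions not found in the text: it uses the substitution $\xi = x + y\tan\varphi$, which turns (18-6) into $\pi^{-1}\int_{-\pi/2}^{\pi/2} f(x + y\tan\varphi)\,d\varphi$, and it tests the result on the boundary values $f(\xi) = 1/(1+\xi^2)$, for which the harmonic function with the required behavior is $(1+y)/[x^2+(1+y)^2]$. The function names are hypothetical.

```python
import math

def poisson_half_plane(f, x, y, n=2000):
    """Midpoint-rule estimate of (18-6) after the substitution xi = x + y*tan(phi),
    which maps the infinite range of xi onto -pi/2 < phi < pi/2."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        phi = -0.5 * math.pi + (i + 0.5) * h
        total += f(x + y * math.tan(phi))
    return total * h / math.pi

def boundary_values(xi):
    """Test boundary data f(xi) = 1 / (1 + xi^2)."""
    return 1.0 / (1.0 + xi * xi)

def exact_solution(x, y):
    """Harmonic function in y > 0 that equals boundary_values on y = 0 and tends to 0."""
    return (1.0 + y) / (x * x + (1.0 + y) ** 2)
```

Comparing `poisson_half_plane(boundary_values, 0.5, 2.0)` with `exact_solution(0.5, 2.0)` shows agreement to many digits, which gives some confidence that the kernel in (18-6) carries the correct factor $y/\pi$.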
PROBLEMS
1. Let $I(x) = \mathbf{T}^{-1}e^{-cs^2}$, where c is constant. (a) Differentiate, and integrate the
result by parts to obtain
$$\frac{dI}{dx} = -\frac{x}{2c}\,I.$$
(b) Using Eq. (10-1), Chap. 8, find the value of $I(0)$. (c) Thus deduce the formula
$$\mathbf{T}^{-1}e^{-cs^2} = (2c)^{-1/2}e^{-x^2/4c}.$$
(In particular, $e^{-x^2/2}$ and $e^{-s^2/2}$ are transforms of each other.)
2. Obtain the temperature distribution for a rod extending from $x = 0$ to $x = \infty$
if the initial temperature is $f(x)$ and the end $x = 0$ is insulated. Hint: Consider a rod
extending from $-\infty$ to $\infty$, with initial temperature $f_0(x)$ defined by
$$f_0(x) \text{ even}, \qquad f_0(x) = f(x) \text{ for } x > 0.$$
Compare Prob. 5, Sec. 10.
3. By taking $f_0(x)$ odd in Prob. 2, find the temperature distribution when the end
$x = 0$ is not insulated but is kept at the temperature zero.
4. A rod extending from $x = 0$ to $x = l$ has the initial temperature distribution $f(x)$.
By regarding this rod as part of an infinite rod with the initial temperature $f_0(x)$, find the
temperature $u(x,t)$ when (a) both ends are insulated, (b) both ends are kept at the
temperature zero. Hint: Let $f_0(x)$ have period $2l$. This method of satisfying boundary
conditions was used for the vibrating-string problem in Sec. 6.
19. The Source Functions for Heat Flow. According to (18-5) the
function
$$\frac{Q}{\rho c(4\pi a^2t)^{1/2}}\,e^{-r^2/4a^2t}, \qquad r^2 = x^2, \tag{19-1}$$
represents the temperature distribution due to an instantaneous source
of strength Q at the origin. Equation (19-1) applies to the one-dimensional
heat equation $a^2u_{xx} = u_t$. The corresponding result for two dimensions
is
$$\frac{Q}{\rho c(4\pi a^2t)}\,e^{-r^2/4a^2t}, \qquad r^2 = x^2 + y^2, \tag{19-2}$$
and for three dimensions it is
$$\frac{Q}{\rho c(4\pi a^2t)^{3/2}}\,e^{-r^2/4a^2t}, \qquad r^2 = x^2 + y^2 + z^2. \tag{19-3}$$
In these formulas r is the distance from the source to the point of observation
and t is the length of time that has elapsed since the heat was released. The
value of $\rho$ is, respectively, the linear, surface, or volume density.
The functions (19-1) to (19-3) are solutions, respectively, of
$$a^2u_{xx} = u_t, \qquad a^2(u_{xx} + u_{yy}) = u_t, \qquad a^2(u_{xx} + u_{yy} + u_{zz}) = u_t.$$
Also they give the limit 0 as $t \to 0$ through positive values, provided $r \ne 0$. Hence
the initial temperature distribution is concentrated entirely at the origin. By integrating
over the whole space it can be shown in each case that the total amount of heat present
is Q when $t > 0$. For these reasons, the physical interpretation as a point source of
strength Q is fully justified.
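The statement about the total heat can be confirmed numerically. In the sketch below, which is an illustration and not part of the text, the heat content $\int \rho c\,u$ is computed by integrating over shells of radius r; the shell measures 2, $2\pi r$, and $4\pi r^2$ for one, two, and three dimensions, the cutoff radius, and the function names are assumptions made for the example.

```python
import math

def source_temperature(r, t, dim, Q=1.0, rho_c=1.0, a=1.0):
    """Formulas (19-1) to (19-3): temperature at distance r after elapsed time t."""
    spread = 4.0 * a * a * t
    return Q / (rho_c * (math.pi * spread) ** (dim / 2.0)) * math.exp(-r * r / spread)

def total_heat(t, dim, Q=1.0, rho_c=1.0, a=1.0, n=2000):
    """Integrate rho*c*u over all space with Simpson's rule in the radius r."""
    shell = {1: lambda r: 2.0,
             2: lambda r: 2.0 * math.pi * r,
             3: lambda r: 4.0 * math.pi * r * r}[dim]
    r_max = 12.0 * a * math.sqrt(t)   # beyond this radius the factor exp(-36) is negligible
    h = r_max / n

    def integrand(r):
        return rho_c * source_temperature(r, t, dim, Q, rho_c, a) * shell(r)

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0
```

Calling `total_heat(0.5, dim, Q=2.5, rho_c=3.0, a=0.7)` for `dim` equal to 1, 2, and 3 returns values close to 2.5 in each case, independent of the elapsed time.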
The expressions (19-1) to (19-3) indicate that heat travels with infinite speed. Even
if r is large, we get a positive temperature for each positive t, no matter how small, but
the initial temperature was zero. By contrast, the disturbance associated with the
wave equation travels with finite speed (cf. Secs. 2 and 4).
To illustrate the use of (19-3) let us find $u(x,y,z,t)$ when the initial
temperature
$$u(x_1,y_1,z_1,0) = f(x_1,y_1,z_1)$$
is given at each point $(x_1,y_1,z_1)$ of space. Instead of this distribution we
introduce a source of strength
$$Q = c\rho f(x_1,y_1,z_1)\,dx_1\,dy_1\,dz_1 \tag{19-4}$$
at $(x_1,y_1,z_1)$. The temperature at point $(x,y,z)$ and time t due to one such
source is given by (19-3), with Q as in (19-4) and with r the distance from
$(x,y,z)$ to the source:
$$r^2 = (x_1 - x)^2 + (y_1 - y)^2 + (z_1 - z)^2.$$
The temperature at $(x,y,z)$ due to all the sources is given by superposition:
$$u(x,y,z,t) = (4\pi a^2t)^{-3/2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-r^2/4a^2t}f(x_1,y_1,z_1)\,dx_1\,dy_1\,dz_1. \tag{19-5}$$
As another illustration we shall find the temperature due to a point
source which emits heat continually. Let $Q(t)$ represent the strength of
the source, so that the amount of heat emitted in time interval $(t_1, t_1 + dt_1)$
is approximately $Q(t_1)\,dt_1$. The temperature at the present time t due to the source
at time $t_1$ is
$$\frac{Q(t_1)\,dt_1}{\rho c[4\pi a^2(t-t_1)]^{3/2}}\,e^{-r^2/4a^2(t-t_1)}$$
when we recall that t in (19-3) stands, not for the time, but for the elapsed
time. Adding the contributions from the source at all values of $t_1$ prior
to t gives
$$u = \int_{-\infty}^{t}\frac{e^{-r^2/4a^2(t-t_1)}}{\rho c[4\pi a^2(t-t_1)]^{3/2}}\,Q(t_1)\,dt_1. \tag{19-6}$$
If $Q(t)$ is a constant Q, the integral can be evaluated explicitly by the change of
variable
$$\sigma = \frac{r^2}{4a^2(t-t_1)};$$
the result is
$$u = \frac{Q}{4\pi a^2\rho c}\,\frac{1}{r}. \tag{19-7}$$
This represents the temperature due to a continuous uniform source of heat at a distance
r from the point of observation. Since the conditions are steady state, the solution
satisfies Laplace's equation. (Compare Sec. 12, where the function 1/r was obtained
in connection with electrostatics and gravitation.)
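The passage from (19-6) to (19-7) can be checked numerically. The sketch below is only an illustration: it assumes a constant source strength, writes the elapsed time as $\tau = t - t_1$, and uses the substitution $\tau = r^2/(4a^2w^2)$, a variant of the change of variable mentioned above with $w^2$ in place of $\sigma$, so that the infinite range of $\tau$ becomes a rapidly convergent integral in w. The function name and the cutoff `w_max` are hypothetical.

```python
import math

def steady_point_source(r, Q=1.0, rho_c=1.0, a=1.0, w_max=8.0, n=800):
    """Midpoint-rule estimate of (19-6) for constant Q, integrating over the
    elapsed time tau = t - t1 through the substitution tau = r^2 / (4 a^2 w^2)."""
    h = w_max / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h
        tau = r * r / (4.0 * a * a * w * w)
        # Instantaneous-source temperature (19-3) for elapsed time tau.
        kernel = Q * math.exp(-r * r / (4.0 * a * a * tau)) / (
            rho_c * (4.0 * math.pi * a * a * tau) ** 1.5)
        # |d tau / d w| converts the integral over tau into an integral over w.
        total += kernel * r * r / (2.0 * a * a * w ** 3)
    return total * h

def steady_closed_form(r, Q=1.0, rho_c=1.0, a=1.0):
    """Formula (19-7)."""
    return Q / (4.0 * math.pi * a * a * rho_c * r)
```

The two functions agree closely for any positive r, and doubling r halves the temperature, which is the 1/r behavior expected of a steady-state solution of Laplace's equation.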
Example: A line contact is pressed against the plane $x = 0$ with constant normal force F per
unit length, the coefficient of friction being a constant $\mu$. At time $t = 0$ it starts to slide in
a direction perpendicular to its length with constant velocity v (see Fig. 25). Obtain the
temperature in the medium $x < 0$, assuming this temperature to have been zero initially and
neglecting heat loss at the surface $x = 0$.
This problem arises in the theory of milling, leather glazing, and lathe turning. To solve it,
let the line contact be initially coincident with the z axis, so that its height at time $t_1$ is $y = vt_1$.
The heat generated by friction per unit length is $F\mu\,dy$, and hence the heat generated per unit
length in the time interval $dt_1$ is
$$Q = F\mu v\,dt_1.$$
Using this value of Q in the result (19-2) we obtain
$$\frac{F\mu v\,dt_1}{4\pi a^2\rho c\,(t-t_1)}\,e^{-[x^2+(y-vt_1)^2]/4a^2(t-t_1)}$$
for the contribution, at the point (x,y,z) and at the present time t, due to motion of the
line contact in the time interval $(t_1, t_1 + dt_1)$. (The reader is reminded that t in (19-2)
stands for the elapsed time and $x^2 + y^2$ in (19-2) is the square of the distance from the
point of observation to the line source.) Superposition yields the final answer:
$$u = \frac{F\mu v}{4\pi a^2\rho c}\int_0^t\frac{e^{-[x^2+(y-vt_1)^2]/4a^2(t-t_1)}}{t-t_1}\,dt_1.$$
PROBLEMS
1. Show that (19-2) can be obtained by integrating (19-3) with respect to z and (19-1)
by integrating (19-2) with respect to y. Interpret physically.
2. What initial- or boundary-value problem is solved by (19-5)? By (19-6)?
3. By use of (19-2) solve the initial-value problem
$$a^2(u_{xx} + u_{yy}) = u_t, \qquad u(x,y,0) = f(x,y).$$
4. Find the temperature distribution $u(x,y,z,t)$ for $x > 0$ due to a time-dependent
distribution $f(y,z,t)$ on the plane $x = 0$. Take the initial temperature as zero for $x > 0$.
5. State and solve the two-dimensional analogue of Prob. 4; the one-dimensional
analogue.
20. A Singular Integral. We shall now derive an integral formula which
can be used in the study of many partial differential equations. The
discussion depends on certain theorems of vector analysis$^1$ summarized
in the following paragraph.
In the divergence theorem
$$\int_\tau (\nabla\cdot\mathbf{A})\,d\tau = \int_\sigma A_n\,d\sigma, \qquad d\tau = \text{volume element}, \quad d\sigma = \text{surface element}, \tag{20-1}$$
the choice $\mathbf{A} = u\nabla v$ yields Green's first identity
$$\int_\tau [u\nabla^2v + (\nabla u\cdot\nabla v)]\,d\tau = \int_\sigma u\,\frac{\partial v}{\partial n}\,d\sigma \tag{20-2}$$
when we recall that $(\nabla v)_n = \partial v/\partial n$, the normal derivative. Writing
(20-2) with u and v interchanged and subtracting give Green's symmetric
identity
$$\int_\tau (u\nabla^2v - v\nabla^2u)\,d\tau = \int_\sigma\left(u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n}\right)d\sigma. \tag{20-3}$$
The conditions for validity of these identities are discussed in Chap. 5.
For our present purposes we need an appropriate form of (20-3) when
v does not satisfy the continuity conditions there required.
1 The reader may find it advisable to review Chap. 5, Secs. 8 to 10. Unless otherwise
indicated, the functions considered in Secs. 20 to 22 are twice continuously differentiable
in the region $\tau$ and on its boundary. The surface of $\tau$ is assumed smooth, so that the
normal is a continuous function of position.
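To this end, we show that
$$\lim_{a\to0}\int_\sigma\frac{f(Q)}{a^c}\,d\sigma = \begin{cases}4\pi f(P), & \text{for } c = 2,\\ 0, & \text{for } c < 2,\end{cases} \tag{20-4}$$
where f is continuous, c is constant, and the region of integration $\sigma$ is
the surface of a sphere of radius a centered at P. Now, the integral (20-4)
may be written
$$\int_\sigma\frac{f(Q)}{a^c}\,d\sigma = \int_\sigma\frac{f(P)}{a^c}\,d\sigma + \int_\sigma\frac{f(Q)-f(P)}{a^c}\,d\sigma \equiv I_1 + I_2.$$
Since the area of the sphere is $4\pi a^2$, we have, as $a \to 0$,
$$I_1 = \frac{f(P)}{a^c}\,4\pi a^2 \to \begin{cases}4\pi f(P), & \text{for } c = 2,\\ 0, & \text{for } c < 2.\end{cases}$$
Since a surface integral does not exceed the area of the surface times the
maximum value of the integrand, we have
$$|I_2| \le 4\pi a^2\,\frac{\max|f(Q)-f(P)|}{a^c} \le 4\pi\max|f(Q)-f(P)|,$$
the second inequality holding as soon as $a \le 1$, since $c \le 2$.
If f is continuous, this tends to zero as $a \to 0$, and
(20-4) follows.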
Let us now apply (20-3) to the function
$$v = w + \frac{1}{r}, \qquad r = r(P,Q), \tag{20-5}$$
where w is twice continuously differentiable and
where r is the distance from a fixed point P to the
variable point of integration, Q. The region of
integration is to be the region inside a given closed
surface $\sigma$ and outside a small sphere $\sigma_1$ of radius a
centered at P (see Fig. 26). In this region $r \ne 0$
and (20-3) can be applied without hesitation.
According to (20-5) we have
$$\nabla^2v = \nabla^2w + \nabla^2\frac{1}{r} = \nabla^2w. \tag{20-6}$$
On $\sigma_1$ the outward normal n is directed along the radius into the sphere,
so that
$$\frac{\partial}{\partial n}\frac{1}{r} = -\frac{\partial}{\partial r}\frac{1}{r} = \frac{1}{r^2} = \frac{1}{a^2}.$$
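Since $r = a$ on $\sigma_1$, the foregoing equation and (20-5) give
$$\frac{\partial v}{\partial n} = \frac{\partial w}{\partial n} + \frac{1}{a^2}, \qquad v = w + \frac{1}{a}, \qquad\text{on } \sigma_1. \tag{20-7}$$
The surface integral in (20-3) can be written as an integral over $\sigma$ plus
an integral over $\sigma_1$. By inspection of (20-7) the integral over $\sigma_1$ is
$$\int_{\sigma_1}\left[u\left(\frac{\partial w}{\partial n} + \frac{1}{a^2}\right) - \left(w + \frac{1}{a}\right)\frac{\partial u}{\partial n}\right]d\sigma. \tag{20-8}$$
This becomes $4\pi u(P)$ as $a \to 0$, in view of (20-4). If we use this result
and (20-6) in (20-3), we obtain the desired formula
$$4\pi u(P) = \int_\tau (u\nabla^2w - v\nabla^2u)\,d\tau + \int_\sigma\left(v\,\frac{\partial u}{\partial n} - u\,\frac{\partial v}{\partial n}\right)d\sigma \tag{20-9}$$
upon letting $a \to 0$. When P is exterior to $\tau$, the same formula is valid,
except that $4\pi u(P)$ must be replaced by 0.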
Since the volume integral in (20-9) is taken over the whole region $\tau$, it includes the
point P at which $v = \infty$. The meaning of the integral is clear from the derivation,
but we shall show directly that a singularity of the type 1/r in a volume integral causes
no convergence difficulties. If $\tau_1$ is the interior of the sphere with surface $\sigma_1$, we have
$$\int_{\tau_1}\frac{1}{r}\,d\tau = \int_0^a\frac{1}{r}\,4\pi r^2\,dr = 2\pi a^2.$$
This is clearly finite and in fact tends to zero as $a \to 0$.
21. The Poisson Equation. If u has continuous second derivatives, then
the Laplacian $\nabla^2u$ is a continuous function of position. We shall denote
this function by $-4\pi\rho(x,y,z)$, so that
$$\nabla^2u = -4\pi\rho. \tag{21-1}$$
The choice $v = 1/r$, $w = 0$ in (20-9) now yields the Poisson formula
$$u(P) = \int_\tau\frac{\rho}{r}\,d\tau + \frac{1}{4\pi}\int_\sigma\left(\frac{1}{r}\frac{\partial u}{\partial n} - u\,\frac{\partial}{\partial n}\frac{1}{r}\right)d\sigma \tag{21-2}$$
when we divide by $4\pi$. As before, $r = r(P,Q)$ is the distance from P to
the variable point of integration Q. The formula (21-2) holds for every
function having continuous second derivatives in $\tau$ and on its boundary.$^1$
We now change our viewpoint. Instead of starting with u and defining
$\rho$ by (21-1), we suppose that $\rho$ is given in advance. Equation (21-1)
is now a partial differential equation for the unknown function u; it is
called the Poisson equation. The foregoing considerations show that if u
satisfies the Poisson equation, then u is given by the Poisson formula.
The interest of the formula is that it yields the values of u throughout the
interior of $\tau$ in terms of u and $\partial u/\partial n$ on the surface only.
1 Provided the boundary is simple enough to permit the use of (20-9). This condition
on $\tau$ is hereby postulated once for all.
For a physical interpretation, let u be the electrostatic potential due to a
charge distribution of density $\rho$. The fact that the potential satisfies
Poisson's equation is established in treatises on electrostatics,$^1$ so that this
interpretation is consistent with (21-1). Since q/r represents the potential
due to a charge q at a distance r from the point of observation, the term
$$\frac{1}{r}\,(\rho\,d\tau), \qquad\text{where } r = r(P,Q) \text{ and } \rho = \rho(Q),$$
represents the potential at P due to the charges within the volume element
$d\tau$ at Q. Hence the first term of (21-2),
$$\int_\tau\frac{\rho}{r}\,d\tau,$$
represents the potential at P due to charges within the body $\tau$. Similarly
the second term in (21-2),
$$\frac{1}{4\pi}\int_\sigma\frac{1}{r}\frac{\partial u}{\partial n}\,d\sigma,$$
represents the potential at P due to a certain surface-charge distribution
on the surface $\sigma$.
To interpret the term
$$\int_\sigma\left(\frac{\partial}{\partial n}\frac{1}{r}\right)\left(-\frac{u}{4\pi}\,d\sigma\right) \tag{21-3}$$
in (21-2), we consider the configuration shown in Fig. 27. Here, a charge $-q$ is
introduced at the point Q on the surface $\sigma$ and a charge $+q$ at a distance $\Delta n$ along the
outward normal n to $\sigma$. The distance from $-q$ to P is r, and the distance from q to P
is denoted by $r_1$. If we take $q = m/\Delta n$, where m is constant, the potential at P is
$$\frac{q}{r_1} - \frac{q}{r} = q\,\Delta\frac{1}{r} = m\,\frac{\Delta(1/r)}{\Delta n} \to m\,\frac{\partial}{\partial n}\frac{1}{r}$$
as $\Delta n \to 0$. [That the limit is $m\,\partial(1/r)/\partial n$ follows from the definition of normal
derivative, without calculation.] The limiting configuration of Fig. 27 is called a dipole;
the constant m is called the moment of the dipole. We have thus found the desired
interpretation of (21-3); namely, (21-3) represents the potential due to a surface
distribution of dipoles having the moments $-u\,d\sigma/4\pi$. A surface distribution of dipoles such
as this is called a double layer.
1 The case $\rho = 0$ in (21-1) is discussed in Chap. 5, Sec. 14. A detailed analysis of the
conditions under which Poisson's equation holds may be found in O. D. Kellogg,
"Foundations of Potential Theory," p. 156, Springer-Verlag OHG, Berlin, 1929.
Since the volume integral in (21-2) is extended only over $\tau$, it does
not take account of the charges outside $\tau$. That purpose is served by the
surface integral in (21-2). From this viewpoint (21-2) shows that the
charges outside $\tau$ can be replaced by a suitable surface charge and double
layer on $\sigma$, without changing the potential within $\tau$. If $\tau$ increases beyond
all bounds, the limiting value of the surface integral can be thought to
represent the influence of the charges at infinity.
In many important problems there are no charges at infinity, so that the
limiting value of the surface integral is zero. To investigate this possibility,
let $\sigma$ be a large sphere of radius a centered at the origin O. By inspection
of the differential triangle in Fig. 28,
$$\Delta r \approx \Delta n\cos\psi \qquad\text{as } \Delta n \to 0,$$
where $\psi$ is the angle between OQ and PQ. The definition of normal
derivative leads to
$$\frac{\partial r}{\partial n} = \lim\frac{\Delta r}{\Delta n} = \cos\psi \tag{21-4}$$
and hence
$$\frac{\partial}{\partial n}\frac{1}{r} = -\frac{1}{r^2}\frac{\partial r}{\partial n} = -\frac{\cos\psi}{r^2}. \tag{21-5}$$
If P is fixed and $a \to \infty$, it is easily seen that $r \sim a$ uniformly with
respect to the point Q on $\sigma$. Hence 1/r has the order of magnitude 1/a,
and by (21-5) the normal derivative has the order of magnitude $1/a^2$.
Now, the surface integral in (21-2) does not exceed $4\pi a^2$ times the maximum
of the integrand. By the foregoing remarks, the integral therefore tends
to zero as $a \to \infty$ if
$$a\max\left|\frac{\partial u}{\partial n}\right| \to 0 \quad\text{and}\quad \max|u| \to 0, \qquad\text{as } a \to \infty. \tag{21-6}$$
In this case (21-2) leads to the simple formula
$$u = \int\frac{\rho}{r}\,d\tau, \qquad\text{integrated over all space.} \tag{21-7}$$
Referred to spherical coordinates$^1$ $(r,\theta,\phi)$ the normal derivative $\partial u/\partial n$
in (21-6) is the radial derivative $\partial u/\partial r$, and $a = r$. Thus (21-6) is
equivalent to
$$\lim_{r\to\infty} r\,\frac{\partial u}{\partial r} = 0, \qquad \lim_{r\to\infty} u = 0, \qquad\text{uniformly in } \theta \text{ and } \phi. \tag{21-8}$$
By substituting (21-5) into the integral and regrouping terms one finds that (21-8)
may be replaced by the weaker condition
$$\lim_{r\to\infty}\left(r\,\frac{\partial u}{\partial r} + u\right) = 0, \qquad \lim_{r\to\infty}\frac{u}{r} = 0, \qquad\text{uniformly in } \theta \text{ and } \phi. \tag{21-9}$$
That is, if u satisfies (21-1) and (21-9), then u can be represented in the form (21-7).
When $\rho = 0$ outside a bounded region, an altogether different procedure$^2$ shows that
the second condition (21-8) also suffices.
Example: If the region $\tau$ is a sphere of radius $r_0$ centered at $P$, every solution of
Poisson's equation satisfies
$$u(P) = \frac{1}{4\pi r_0^2} \int_\sigma u\, d\sigma + \int_\tau \left(\frac{1}{r} - \frac{1}{r_0}\right) \rho\, d\tau. \tag{21-10}$$
Here $r \equiv r(P,Q)$ is the distance from $P$ to the variable point of integration $Q$. To
prove (21-10) we choose $w = -1/r_0$ in (20-5) and note that on $\sigma$
$$\frac{\partial v}{\partial n} = \frac{\partial v}{\partial r} = -\frac{1}{r^2} = -\frac{1}{r_0^2}.$$
The desired result follows at once from (20-9). The special case $\rho = 0$ in (21-10) yields
the Average-value Theorem: If a function is harmonic throughout a sphere, its value
at the center of the sphere equals the average of the values on the surface. This fact is of
central importance in the study of harmonic functions.
The merit of taking $w = -1/r_0$ in (20-5) is that then $v = 0$ on $\sigma$. Hence the term
involving $\partial u/\partial n$ in the surface integral (20-9) drops out. The possibility of making
such a choice of $v$ will be systematically exploited in Sec. 23.
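The average-value theorem stated in the example can be checked numerically. The following is only a minimal sketch: the sample function, the center, the radius, and the midpoint-rule quadrature are all illustrative assumptions, not part of the text.

```python
import math

def sphere_mean(u, center, radius, n_theta=64, n_phi=64):
    """Approximate the mean of u over a sphere by the midpoint rule.

    Sampling uniformly in cos(theta) and in phi gives equal-area cells,
    so the plain average of the samples approximates the surface mean.
    """
    cx, cy, cz = center
    total = 0.0
    for i in range(n_theta):
        c = -1.0 + (2.0 * i + 1.0) / n_theta   # midpoint value of cos(theta)
        s = math.sqrt(1.0 - c * c)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            total += u(cx + radius * s * math.cos(phi),
                       cy + radius * s * math.sin(phi),
                       cz + radius * c)
    return total / (n_theta * n_phi)

def harmonic(x, y, z):
    # Hypothetical test function; its Laplacian is 2 - 2 + 0 + 0 = 0.
    return x * x - y * y + 3.0 * z + 2.0 * x * y

center = (0.3, -0.2, 0.5)
mean_value = sphere_mean(harmonic, center, radius=1.5)
center_value = harmonic(*center)
```

For this polynomial the two numbers agree to rounding error; a function that is not harmonic (for example $x^2$) would show a discrepancy that grows with the radius.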
¹ The $r$ in (21-8) has no relation to the $r \equiv r(P,Q)$ that appears elsewhere in this
discussion.
² See H. B. Phillips, "Vector Analysis," p. 158, John Wiley & Sons, Inc., New York,
1933.
PROBLEMS
1. If u is harmonic, show that the choice $v = u$ in (20-2) gives
$$\int_\tau \left(u_x^2 + u_y^2 + u_z^2\right) d\tau = \int_\sigma u\,\frac{\partial u}{\partial n}\, d\sigma.$$
2. Show that a solution of Poisson's equation in a closed bounded surface $\sigma$ is wholly
determined by its boundary values and that it is determined, apart from an additive
constant, by the boundary values of the normal derivative. Hint: If $u_1$ and $u_2$ are two
solutions, apply Prob. 1 to $u = u_1 - u_2$ and then use the result of Prob. 4.
3. Let u be harmonic in a region $\tau$, and suppose u assumes its maximum value $u_0$
at an interior point P. Show that $u = u_0$ throughout every sphere contained in $\tau$ and
centered at P. Hint: If $M(u)$ denotes the mean value on the surface of such a sphere,
then $u_0 = M(u)$ and hence $M(u_0 - u) = 0$. Now use Prob. 4.
4. Let $f(Q)$ be continuous and nonnegative in a region $\tau$. If $\int_\tau f\, d\tau = 0$, then $f \equiv 0$.
Similarly for surface integrals. Hint: If $f = \epsilon > 0$ at an interior point P, then by
continuity $f > \epsilon/2$ throughout some sphere $\tau_1$ of radius $\delta > 0$ centered at P. But this
gives $\int_\tau f\, d\tau \ge \int_{\tau_1} f\, d\tau \ge \int_{\tau_1} (\epsilon/2)\, d\tau > 0$.
22. The Helmholtz Formula. The Helmholtz equation
$$\nabla^2 u + k^2 u = 0 \tag{22-1}$$
is obtained by separating variables in the wave equation (Sec. 14) or by
requiring harmonic time dependence (Sec. 17). A brief calculation¹ shows
that (22-1) has the solution $e^{ikr}/r$, where $r$ is the distance from a fixed
point $P$ to a variable point $Q$. If we set
$$v = \frac{e^{ikr}}{r}, \qquad w = v - \frac{1}{r}, \tag{22-2}$$
it follows that $\nabla^2 w = \nabla^2 v = -k^2 v$ for $r \ne 0$. Hence
$$u\nabla^2 w - v\nabla^2 u = u(-k^2 v) - v(-k^2 u) = 0, \tag{22-3}$$
provided $r \ne 0$ and provided $u$ satisfies (22-1). Substituting (22-2) into
(20-9) with due regard to (22-3) now yields the Helmholtz formula
$$u(P) = \frac{1}{4\pi} \int_\sigma \left[\frac{e^{ikr}}{r}\,\frac{\partial u}{\partial n} - u\,\frac{\partial}{\partial n}\left(\frac{e^{ikr}}{r}\right)\right] d\sigma. \tag{22-4}$$
This expresses the solution u of (22-1) as an integral involving the boundary
values of u and $\partial u/\partial n$.
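¹ Let the Laplacian be referred to spherical coordinates with origin at P.
Sometimes the region $\tau$ is bounded and (22-1) holds at points exterior to
$\tau$. To see if (22-4) remains valid in this case, we construct a sphere $\tau_1$
centered at the origin and having a radius $a$ so large that $\tau$ is contained
entirely within $\tau_1$ (Fig. 29). Formula (22-4) applied to the region between
$\tau$ and the surface $\sigma_1$ of the sphere gives
$$4\pi u(P) = \int_\sigma \left[\frac{e^{ikr}}{r}\,\frac{\partial u}{\partial n} - u\,\frac{\partial}{\partial n}\left(\frac{e^{ikr}}{r}\right)\right] d\sigma + \int_{\sigma_1} \left[\frac{e^{ikr}}{r}\,\frac{\partial u}{\partial n} - u\,\frac{\partial}{\partial n}\left(\frac{e^{ikr}}{r}\right)\right] d\sigma.$$
In the first integral the outer normal for the region of integration is the
inner normal for $\tau$. With this understanding we see that (22-4) holds in
the present case, provided the integral over $\sigma_1$ tends to zero as $a \to \infty$.
To investigate the behavior as $a \to \infty$, note that on $\sigma_1$
$$\frac{\partial}{\partial n}\,\frac{e^{ikr}}{r} = \frac{d}{dr}\left(\frac{e^{ikr}}{r}\right) \frac{\partial r}{\partial n} = \frac{e^{ikr}}{r} \left(ik - \frac{1}{r}\right) \cos\psi$$
by (21-4). Hence the integral over $\sigma_1$ becomes, after rearrangement,
$$\int_{\sigma_1} \frac{e^{ikr}}{r} \left[\frac{\partial u}{\partial n} - iku + iku(1 - \cos\psi) + \frac{1}{r}\,u \cos\psi\right] d\sigma. \tag{22-5}$$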
As $a \to \infty$ with P fixed, we have $a \sim r$. Also, the law of cosines applied to Fig. 28
shows that $a^2(1 - \cos\psi)$ remains bounded. Hence, the integral will tend to 0 provided
$$a \max \left|\frac{\partial u}{\partial n} - iku\right| \to 0 \quad \text{and} \quad \max |u| \to 0, \qquad \text{as } a \to \infty.$$
This assumes $k$ real, so that $|e^{ikr}| = 1$.
In spherical coordinates $(r,\theta,\phi)$ the result of the foregoing discussion
may be summarized as follows: Formula (22-4) applies to the exterior of
the bounded region $\tau$ provided
$$r \left(\frac{\partial u}{\partial r} - iku\right) \to 0 \quad \text{and} \quad u \to 0 \tag{22-6}$$
as $r \to \infty$, uniformly in $\theta$ and $\phi$. In just the same way it is found that
(21-2) applies to the exterior of the bounded region $\tau$ provided (22-6) holds
with $k = 0$.
Equation (22-6) with $k = 0$ is sometimes called the Dirichlet condition. It is the
same as (21-8), hence means that there are no charges at infinity. Equation (22-6)
with $k \ne 0$ is called the Sommerfeld radiation condition; it means that there are no
sources of radiation at infinity.
Although (22-6) is the form usually given, it is unnecessarily restrictive. A more
careful analysis of (22-5) shows that (22-6) may be replaced by the weaker condition
$$r \left(\frac{\partial u}{\partial r} - iku + \frac{u}{r}\right) \to 0 \quad \text{and} \quad \frac{u}{r} \to 0, \tag{22-7}$$
which reduces to (21-9) when $k = 0$.
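A small numerical sketch may help to see what the radiation condition (22-6) selects. It evaluates $|r(\partial u/\partial r - iku)|$ for the outgoing wave $e^{ikr}/r$ and for the incoming wave $e^{-ikr}/r$; the function name, the value of $k$, and the sample radii are illustrative assumptions.

```python
import cmath

def radiation_residual(k, r, sign):
    """Return |r (du/dr - i k u)| for u = exp(sign * i k r) / r."""
    u = cmath.exp(sign * 1j * k * r) / r
    du_dr = u * (sign * 1j * k - 1.0 / r)   # exact radial derivative of u
    return abs(r * (du_dr - 1j * k * u))

k = 2.0
radii = (1e1, 1e2, 1e3)
outgoing = [radiation_residual(k, r, +1) for r in radii]   # behaves like 1/r
incoming = [radiation_residual(k, r, -1) for r in radii]   # approaches 2k
```

Under these assumptions the outgoing wave satisfies (22-6), since the residual decays like $1/r$, whereas the incoming wave leaves a residual near $2k$ and so is excluded.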
23. The Functions of Green and Neumann. The Laplace equation can
be obtained by setting $\rho = 0$ in Poisson's equation (21-1) or by setting
$k = 0$ in the Helmholtz equation (22-1). The corresponding integral
formulas, (21-2) and (22-4), both reduce to
$$u(P) = \frac{1}{4\pi} \int_\sigma \left(\frac{1}{r}\,\frac{\partial u}{\partial n} - u\,\frac{\partial}{\partial n}\,\frac{1}{r}\right) d\sigma. \tag{23-1}$$
This expresses every harmonic function in $\tau$ as an integral involving the
boundary values and the boundary values of the normal derivative. However,
a harmonic function is determined by the boundary values alone,
without any reference to the normal derivative.¹ We shall now obtain a
formula, similar to (23-1), in which $\partial u/\partial n$ is not present.
Such a formula can be found by an appropriate choice of $v$ in (20-9).
Since $\nabla^2 u = 0$, the volume integral in (20-9) will drop out if
$$\nabla^2 w = 0 \quad \text{throughout } \tau. \tag{23-2}$$
And the term involving $\partial u/\partial n$ in (20-9) will drop out if $v = 0$ on $\sigma$. By
(20-5), that condition is equivalent to
$$w = -\frac{1}{r} \quad \text{on } \sigma. \tag{23-3}$$
Evidently, (23-2) and (23-3) determine $w$ uniquely. Since the function
$r \equiv r(P,Q)$ involves the fixed point P, the boundary condition (23-3)
makes $w$, and hence $v$, depend on P. The function $v$ obtained in this way
is called Green's function and is denoted by $G(P,Q)$. Thus,
$$G(P,Q) \equiv v = w + \frac{1}{r}, \tag{23-4}$$
where $r \equiv r(P,Q)$ and where $w$ satisfies (23-2) and (23-3). The formula
(20-9) now yields
$$u(P) = -\frac{1}{4\pi} \int_\sigma u\,\frac{\partial v}{\partial n}\, d\sigma = -\frac{1}{4\pi} \int_\sigma u\,\frac{\partial G}{\partial n}\, d\sigma, \tag{23-5}$$
with G given by (23-4). The differentiation and integration in (23-5) are
with respect to Q.
What we have shown is the following: Let u satisfy
$$\nabla^2 u = 0 \ \text{in } \tau, \qquad u = f \ \text{on } \sigma. \tag{23-6}$$
If the region $\tau$ has the Green function G, then
$$u(P) = -\frac{1}{4\pi} \int_\sigma f(Q)\,\frac{\partial G}{\partial n}\, d\sigma. \tag{23-7}$$
When a continuous function $f$ is given in advance, it can be shown, conversely,
that the function u in (23-7) satisfies (23-6). In other words,
formula (23-7) solves the Dirichlet problem. The general Dirichlet problem
is thus reduced to the special Dirichlet problems¹ that have to be solved
in constructing Green's function.
To interpret Green's function physically, let a unit charge be placed
at the point P interior to a closed, grounded conducting surface $\sigma$. Since
P is the only charge present, the potential has the form $v = w + 1/r$, where
$\nabla^2 w = 0$. Since the conductor $\sigma$ is a grounded equipotential, $v = 0$ on $\sigma$,
and hence $v$ agrees with the $v$ in the foregoing paragraph.
Thus, $G(P,Q)$ is the potential at Q due to a unit charge at P in the grounded
conducting surface $\sigma$. Because of this interpretation
the existence of Green's function is very plausible on physical grounds.²
The physical interpretation not only suggests that $G(P,Q)$ exists but gives a method
of finding it in many cases. As an illustration, we shall construct Green's function
for the half space $z > 0$ (Fig. 30). Let a charge $q = 1$ be placed at P and a charge
$q = -1$ at $P_1$, the mirror image of P in the plane $z = 0$. By symmetry the potential
$v = 0$ when $z = 0$, and hence $v$ is Green's function. If $r$ is the distance from P to Q
and $r_1$ the distance from $P_1$ to Q, the potential is
$$G(P,Q) = \frac{1}{r} - \frac{1}{r_1}.$$
As in the derivation of (21-5),
$$\frac{\partial G}{\partial n} = -\frac{2\cos\psi}{r^2} \qquad \text{on } z = 0,$$
where $\psi$ is the angle between PQ and the normal to $z = 0$. Substituting into (23-7)
yields the Poisson formula for a half space,
$$u(x,y,z) = \frac{1}{2\pi} \int_\sigma f\,\frac{\cos\psi}{r^2}\, d\sigma = \frac{z}{2\pi} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \frac{f(x_1,y_1)\, dx_1\, dy_1}{\left[(x - x_1)^2 + (y - y_1)^2 + z^2\right]^{3/2}}.$$
This formula represents a harmonic function for $z > 0$, which reduces to $f(x,y)$ when
$z = 0$.
¹ One problem for each choice of P.
² A proof of the existence for all regions likely to be met in practice is given in O. D.
Kellogg, "Foundations of Potential Theory," chap. 11, Springer-Verlag OHG, Berlin,
1929.
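The Poisson formula for a half space can be tried out numerically. The sketch below is an illustration under stated assumptions: the integral is truncated to a finite square and evaluated by the midpoint rule, and the test data are the boundary values of the potential of a unit charge placed at the hypothetical point $(0,0,-1)$, which is harmonic for $z > 0$.

```python
import math

def poisson_half_space(f, x, y, z, half_width=20.0, step=0.1):
    """Midpoint-rule value of the half-space Poisson integral at (x, y, z).

    The infinite plane is truncated to a square of side 2 * half_width
    centered under the field point, so the result is only approximate.
    """
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n):
        x1 = x - half_width + (i + 0.5) * step
        for j in range(n):
            y1 = y - half_width + (j + 0.5) * step
            dist_sq = (x - x1) ** 2 + (y - y1) ** 2 + z * z
            total += f(x1, y1) / (dist_sq * math.sqrt(dist_sq))
    return z / (2.0 * math.pi) * total * step * step

def exact(x, y, z):
    # Potential of a unit charge at (0, 0, -1); harmonic in z > 0.
    return 1.0 / math.sqrt(x * x + y * y + (z + 1.0) ** 2)

def boundary(x1, y1):
    return exact(x1, y1, 0.0)

approx = poisson_half_space(boundary, 0.0, 0.0, 1.0)
target = exact(0.0, 0.0, 1.0)
```

The two values differ mainly by the truncation of the plane, which here removes roughly one part in a thousand; enlarging `half_width` reduces the gap at the cost of more function evaluations.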
In terms of heat flow, the Dirichlet problem is to compute the steady-state
temperature in a solid when the temperature on the surface is known.
Sometimes the rate of heat flow across the surface is prescribed rather than
the temperature. The problem which arises in this way is called the
Neumann problem; it leads to the equations
$$\nabla^2 u = 0 \ \text{in } \tau, \qquad \frac{\partial u}{\partial n} = g \ \text{on } \sigma. \tag{23-8}$$
If (23-8) is to have a solution, we must restrict $g$ so that the rate of
flow into $\tau$ equals the rate out; otherwise, a steady-state temperature
cannot be expected. It is clear physically that the appropriate condition
is
$$\int_\sigma g\, d\sigma = 0 \tag{23-9}$$
and indeed, the choice $v = 1$ in (20-3) shows that (23-9) follows from
(23-8).
When $g$ satisfies (23-9), the problem is still not well posed because it has
infinitely many solutions. That is, (23-8) involves the derivatives only,
so that u can be altered by an additive constant. To make the solution
unique we require that
$$\int_\sigma u\, d\sigma = 0. \tag{23-10}$$
Properly stated, the Neumann problem is to solve (23-8) when (23-9) and
(23-10) hold.
By means of (20-9) we can develop a Neumann function $N(P,Q)$ analogous
to the Green function $G(P,Q)$ of the foregoing paragraphs. As
before, the condition
$$\nabla^2 w = 0 \quad \text{throughout } \tau \tag{23-11}$$
makes the volume integral in (20-9) drop out. To get rid of the surface
integral involving u, we require that $\partial v/\partial n$ be constant¹ on $\sigma$, and we
recall (23-10). Since $v = w + 1/r$, this requirement is
$$\frac{\partial w}{\partial n} = -\frac{\partial}{\partial n}\,\frac{1}{r} + \text{const}. \tag{23-12}$$
To make $w$ unique we require also that
$$\int_\sigma w\, d\sigma = 0. \tag{23-13}$$
The Neumann function is
$$N(P,Q) \equiv v = w + \frac{1}{r}, \tag{23-14}$$
where $w$ satisfies (23-11) to (23-13). The solution u of (23-8) can evidently
be expressed in the form
$$u(P) = \frac{1}{4\pi} \int_\sigma g v\, d\sigma = \frac{1}{4\pi} \int_\sigma g(Q) N(P,Q)\, d\sigma. \tag{23-15}$$
When $g$ is given in advance, it can be shown, conversely, that the function
(23-15) satisfies (23-8). Hence, if we solve the particular Neumann
problems involved in the construction of $N(P,Q)$, we can solve the general
Neumann problem for the region.
Physically, the Neumann function represents the heat flow due to a source of strength
$4\pi$ at P when the heat flows out at a uniform rate across the boundary. This shows
that the condition
$$\frac{\partial v}{\partial n} = 0 \quad \text{on } \sigma \tag{23-16}$$
analogous to (23-3) cannot be required in general; when the region is bounded, (23-16)
violates the principle of conservation of heat. For unbounded regions (23-16) is possible,
as we see by considering the Neumann function for a half space,
$$N(P,Q) = \frac{1}{r} + \frac{1}{r_1}. \tag{23-17}$$
It is left for the reader to verify that (23-17) satisfies (23-16) on the plane $z = 0$ and
to solve the Neumann problem.
¹ See remarks at the end of this section.
ELLIPTIC, PARABOLIC, AND HYPERBOLIC EQUATIONS
24. Classification and Uniqueness. If a, b, and c are real continuous
functions of x and y, and if H is a continuous function of the indicated
arguments, the partial differential equation
$$a u_{xx} + 2b u_{xy} + c u_{yy} = H(x,y,u,u_x,u_y) \tag{24-1}$$
includes many equations of mathematical physics. It is convenient to
classify equations of the type (24-1) according to the sign of the discriminant,
$b^2 - ac$. When $b^2 - ac < 0$, the equation is said to be elliptic; when
$b^2 - ac = 0$, it is parabolic; and when $b^2 - ac > 0$, it is hyperbolic. This
nomenclature is suggested by analogy with the conic
$$ax^2 + 2bxy + cy^2 = H, \qquad a, b, c, H \text{ const},$$
which is an ellipse, a parabola, or a hyperbola according to the sign of
$b^2 - ac$.
As typical illustrations, the reader can verify that
$$u_{xx} + u_{yy} = 0, \qquad u_{xx} = u_y, \qquad u_{xx} - u_{yy} = 0 \tag{24-2}$$
are elliptic, parabolic, and hyperbolic equations, respectively. The first
of these is Laplace's equation; the second¹ is the equation for heat flow;
the third² describes the motion of waves on a string. The general equation
(24-1) in the elliptic, parabolic, or hyperbolic case has much in common
with the corresponding Eq. (24-2), and that is the reason why the classification
is important. We shall now discuss the conditions for unique
determination of u.
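The classification rule can be written out as a short function. This is a minimal sketch; the function name and the use of exact comparison with zero (appropriate for integer or exactly representable coefficients) are assumptions made for illustration.

```python
def classify(a, b, c):
    """Classify a*u_xx + 2*b*u_xy + c*u_yy = H by the sign of b^2 - a*c."""
    disc = b * b - a * c
    if disc < 0:
        return "elliptic"
    if disc == 0:
        return "parabolic"
    return "hyperbolic"

# The three model equations of (24-2), written as (a, b, c).
laplace = classify(1, 0, 1)     # u_xx + u_yy = 0
heat = classify(1, 0, 0)        # u_xx = u_y
wave = classify(1, 0, -1)       # u_xx - u_yy = 0
```

When a, b, and c depend on x and y, the same test is applied point by point, so an equation may change type from one region of the plane to another.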
Case I. Elliptic Equation. Physical considerations suggest that a
solution of Laplace's equation is wholly determined by its boundary values.
That is, if $u_1$ and $u_2$ satisfy
$$u_{xx} + u_{yy} = 0 \tag{24-3}$$
in a bounded region $\tau$, and if $u_1 = u_2$ on the boundary, then $u_1 \equiv u_2$ in
$\tau$. A mathematical proof is readily given, assuming that the function
$u = u_1 - u_2$ is continuous in $\tau$ and on the boundary.
Without loss of generality, let the region $\tau$ lie between the lines $x = 0$ and $x = 1$
(so that $\cos x \ne 0$ in $\tau$). If $v$ is defined by
$$u = v \cos x,$$
a short calculation shows that (24-3) yields
$$v_{xx} + v_{yy} - 2 v_x \tan x - v = 0. \tag{24-4}$$
Suppose that $v > 0$ at an interior point $P_0$. Then $v$ assumes a positive maximum at
an interior point $P_1$ (since $v = 0$ on the boundary and $v$ is continuous). At $P_1$ we have
$$v > 0, \qquad v_x = 0, \qquad v_{xx} \le 0, \qquad v_{yy} \le 0,$$
and hence (24-4) cannot hold. This contradiction shows that $v \le 0$ throughout the
region. Similarly, $v \ge 0$, and hence $v \equiv 0$. It follows that $u \equiv 0$, as was to be proved.
The same method can be used in three dimensions; the only change is that (24-4)
has an extra term $v_{zz}$. A decidedly less elementary proof (which applies also to the
Neumann problem) was given in Sec. 21, Prob. 2.
¹ Let $y = \alpha^2 t$ in (9-5).
² Let $y = at$ in (2-4).
What we have actually shown is that if the harmonic function u satisfies
$u \le 0$ on the boundary, then the same inequality holds at interior points.
Considering $u - m$ instead of u yields the following significant result:¹
Maximum Principle. Let u be harmonic in a bounded region $\tau$ and let
m be constant. If $u \le m$ throughout the boundary of $\tau$, then $u \le m$
throughout $\tau$.
This theorem is true for the general equation 2 (24-1), provided (24-1)
is elliptic and H > 0.
Case II. Parabolic Equation. Let u be the temperature of a thin rod
extending from $x = 0$ to $x = l$. With $y = \alpha^2 t$ the equation of heat conduction
is
$$u_{xx} = u_y, \qquad 0 < x < l, \quad 0 < y < \infty. \tag{24-5}$$
As typical initial and boundary conditions, we assume that
$$u(x,0) = f(x), \qquad u(0,y) = g(y), \qquad u(l,y) = h(y). \tag{24-6}$$
These conditions give the initial temperature and the temperature of the
two ends. In the xy plane, (24-6) specifies the value
of u on the boundary of a certain semi-infinite rec-
tangle (Fig. 31). The physical interpretation sug-
gests that u is thereby determined within the
rectangle, and we shall now show that this is, in
general, the case.
Let $u_1$ and $u_2$ satisfy (24-5) and (24-6). The function
$$u = u_1 - u_2$$
then satisfies (24-5) and (24-6) with $f = g = h = 0$. For
simplicity, we shall suppose that u is continuous and bounded
in the region of Fig. 31 and on its boundary, though these
conditions could be weakened.
If $v$ is defined by $u = v e^{y}$, substitution in (24-5) yields
$$v_{xx} = v_y + v. \tag{24-7}$$
Suppose that $v = v_0 > 0$ at some point $P_0$ of the rectangle in Fig. 31. We know that
$v = 0$ on the three sides of this rectangle, and since u is bounded, the equation $v = e^{-y} u$
shows that $v < v_0$ if y is large enough. It follows that $v$ assumes a positive maximum
at some interior point $P_1$. At $P_1$,
$$v_{xx} \le 0, \qquad v_y + v > 0,$$
and hence (24-7) cannot hold. This contradiction shows that $v \le 0$ everywhere.
Similarly, $v \ge 0$, and hence $v \equiv 0$. It follows that $u_1 \equiv u_2$.
¹ Cf. Sec. 21, Prob. 3.
² See H. Bateman, "Partial Differential Equations of Mathematical Physics,"
p. 135, Cambridge University Press, London, 1932.
The method of proving this uniqueness theorem leads, just as in the foregoing discussion,
to a Maximum Principle: If $u \le m$ on the boundary of the rectangle in Fig. 31, then
$u \le m$ throughout the rectangle. A physical interpretation is readily given.
PROBLEMS
1. For what values of the constant k is $u_{xx} + k u_{xy} + u_{yy} = 0$ elliptic? Parabolic?
Hyperbolic?
2. In what regions of the xy plane is
$$(1 + y) u_{xx} + 2x u_{xy} + (1 - y) u_{yy} = u_x$$
elliptic? Parabolic? Hyperbolic?
3. Show that the solution of the elliptic equation
$$u_{xx} + u_{yy} = -ku$$
is not always uniquely determined by the boundary values. Hint: Let the region be
the square $0 < x < \pi$, $0 < y < \pi$, and separate variables. For a physical interpretation,
see Sec. 14.
4. A characteristic value for a region $\tau$ is a constant $\lambda$ such that the problem
$$u_{xx} + u_{yy} + u_{zz} + \lambda u = 0 \ \text{in } \tau, \qquad u = 0 \ \text{on the boundary}$$
has a solution other than the trivial solution $u \equiv 0$. Show that a characteristic value
is always positive. Hint: If $u \not\equiv 0$, then u has a positive maximum or a negative
minimum at some interior point P.
5. The semi-infinite strip $0 < x < \pi$, $y > 0$ has its edges kept at the constant
temperature $u = 0$, whereas its end $y = 0$ is kept at the temperature $u = \sin x$. In the
steady state the temperature u satisfies $u_{xx} + u_{yy} = 0$, and also
$$u(0,y) = 0, \qquad u(\pi,y) = 0, \qquad u(x,0) = \sin x.$$
(a) By the method of separating variables obtain infinitely many distinct solutions to
this problem. (b) Show that only one of these solutions satisfies $\lim_{y \to \infty} u(x,y) = 0$
uniformly in x. (c) If condition (b) is imposed, show that the problem has, in fact, only
one solution. Hint: Use the maximum principle.
25. Further Discussion of Uniqueness. Continuing the study of (24-1)
we consider the hyperbolic equation
$$u_{xx} - u_{yy} = 0. \tag{25-1}$$
Since solutions of (25-1) do not satisfy the maximum principle, the foregoing
methods cannot be used here.
Case IIIa. Hyperbolic Equation, First Problem. Let the value and
normal derivative of u in (25-1) be given on an interval (a,b) of the x
axis (Fig. 32). Thus,
$$u(x,0) = f(x), \qquad u_y(x,0) = g(x), \qquad a < x < b.$$
If $G(x) = \int_a^x g(s)\, ds$, d'Alembert's formula (3-15) yields an expression
$$2u(x,y) = f(x + y) + G(x + y) + f(x - y) - G(x - y), \tag{25-2}$$
which will now be used to discuss the uniqueness of u.
By hypothesis, $f(x)$ and $G(x)$ are determined for $a < x < b$ but not
outside this interval. Hence
$$\begin{aligned} f(x + y) + G(x + y) &\ \text{is determined for } a < x + y < b, \\ f(x - y) - G(x - y) &\ \text{is determined for } a < x - y < b, \end{aligned} \tag{25-3}$$
but not elsewhere. In the xy plane, the loci
$$a < x + y < b, \qquad a < x - y < b$$
represent two strips, bounded by the two pairs of lines
$$x + y = a, \quad x + y = b \qquad \text{and} \qquad x - y = a, \quad x - y = b \tag{25-4}$$
(Fig. 32). Both expressions (25-3), and hence u in (25-2), are uniquely
determined in the intersection of these strips, but only there. This shows
that the region of determinacy is the doubly shaded region in the figure.¹
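The region of determinacy and formula (25-2) can both be tried on a concrete case. In this sketch the interval, the data $f = \sin$ and $g = \cos$, and the function names are illustrative assumptions; with these data (25-2) should reproduce $u = \sin(x + y)$.

```python
import math

def is_determined(x, y, a, b):
    """True when (x, y) lies in both strips a < x + y < b and a < x - y < b."""
    return a < x + y < b and a < x - y < b

def d_alembert(f, G, x, y):
    """Evaluate u from (25-2): 2u = f(x+y) + G(x+y) + f(x-y) - G(x-y)."""
    return 0.5 * (f(x + y) + G(x + y) + f(x - y) - G(x - y))

a, b = 0.0, 4.0
f = math.sin                                  # prescribed u(x, 0)

def G(s):
    # Integral of g = cos from a to s, i.e. of the prescribed u_y(x, 0).
    return math.sin(s) - math.sin(a)

inside = is_determined(2.0, 1.0, a, b)        # both strips contain the point
outside = is_determined(2.0, 2.5, a, b)       # x + y exceeds b
value = d_alembert(f, G, 2.0, 1.0)            # compare with sin(2 + 1)
```

Points with negative y can also lie in the region, since the two strips intersect in a square standing on one corner.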
Similar behavior is found for the general hyperbolic equation, the role of the lines
(25-4) being taken by the characteristics introduced in Sec. 29. It is often possible to
express u by an integral formula involving the initial value and normal derivative.
The method requires construction of the Riemann function,² which is in some respects
analogous to the Green function of Sec. 23.
Case IIIb. Hyperbolic Equation, Second Problem. The equation
$u_{xx} = u_{yy}$ has the general solution
$$u(x,y) = f_1(x - y) + f_2(x + y) \tag{25-5}$$
as was shown in Sec. 1. If u is given on two adjacent sides of the rectangle
in Fig. 33, we shall use (25-5) to show that u is determined in the whole
rectangle.³
¹ Cf. Theorem I, Sec. 4.
² See A. G. Webster, "Partial Differential Equations of Mathematical Physics,"
p. 248, Teubner Verlagsgesellschaft, Leipzig, 1927.
³ Cf. Theorem II, Sec. 4.
Choose a point R in the rectangle, and draw RQ and RS parallel to the
sides of the rectangle, as in the figure. With P the apex of the rectangle,
$x - y$ is constant on PQ and $x - y$ is also constant on SR. Hence the
same is true of $f_1(x - y)$:
$$f_1(x - y) = \alpha \ \text{at P and Q}, \qquad f_1(x - y) = \beta \ \text{at R and S}.$$
Similarly,
$$f_2(x + y) = \gamma \ \text{at P and S}, \qquad f_2(x + y) = \delta \ \text{at Q and R}.$$
By using these values in (25-5) we can verify the identity
$$u(R) = u(Q) + u(S) - u(P). \tag{25-6}$$
This shows that u is determined by the data at every point in the rectangle
and at no point outside the rectangle.
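Identity (25-6) is easy to confirm for a sample solution. The sketch below assumes a particular choice of $f_1$ and $f_2$ and particular characteristic lines, all of them hypothetical; any other choice should behave the same way.

```python
import math

def u(x, y):
    # A sample solution of u_xx = u_yy of the form f1(x - y) + f2(x + y).
    return math.exp(x - y) + math.sin(x + y)

def corner(xi, eta):
    """Return the point (x, y) with x - y = xi and x + y = eta."""
    return ((eta + xi) / 2.0, (eta - xi) / 2.0)

xi_pq, xi_sr = 0.0, 1.2      # values of x - y on PQ and on SR
eta_ps, eta_qr = 0.5, 2.0    # values of x + y on PS and on QR

P = corner(xi_pq, eta_ps)
Q = corner(xi_pq, eta_qr)
S = corner(xi_sr, eta_ps)
R = corner(xi_sr, eta_qr)

residual = u(*R) - (u(*Q) + u(*S) - u(*P))
```

The residual vanishes up to rounding because the $f_1$ terms cancel in pairs along $x - y = \text{const}$ and the $f_2$ terms cancel along $x + y = \text{const}$.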
By a procedure known as Picard's method, the problem just discussed
can often be solved for the general hyperbolic equation (24-1). It is
supposed that the equation has the form¹
$$u_{xy} = H(x,y,u,u_x,u_y) \tag{25-7}$$
and for simplicity we assume the homogeneous boundary conditions
$$u(x,0) = 0, \qquad u(0,y) = 0, \qquad 0 \le x \le a, \quad 0 \le y \le b. \tag{25-8}$$
Thus, $u = 0$ on two adjacent sides of the rectangle in Fig. 34. The conditions
(25-8) enable us to write (25-7) in the form
$$u(x,y) = \int_0^y\!\int_0^x H(x_1,y_1,u,u_x,u_y)\, dx_1\, dy_1,$$
where the arguments of $u$, $u_x$, and $u_y$ in the integral are $x_1$ and $y_1$.
Picard's method consists of choosing a first approximation $u^{(0)}$, evaluating
the integral, and using the result as the second approximation $u^{(1)}$. A similar
process yields $u^{(2)}$, and so on. If $u^{(n)}$ is the nth approximation,
the next approximation is
$$u^{(n+1)}(x,y) = \int_0^y\!\int_0^x H\left(x_1,y_1,u^{(n)},u_x^{(n)},u_y^{(n)}\right) dx_1\, dy_1. \tag{25-9}$$
Subject to mild restrictions on H it can be shown² that the solution is
¹ See Sec. 29, Case III.
² R. Courant and D. Hilbert, "Methoden der mathematischen Physik," vol. II,
p. 317, Springer-Verlag OHG, Berlin, 1937.
given by
$$u(x,y) = \lim_{n \to \infty} u^{(n)}(x,y).$$
As an illustration, let the equation be
$$u_{xy} = 1 + u.$$
By (25-9),
$$u^{(n+1)} = \int_0^y\!\int_0^x \left[1 + u^{(n)}\right] dx_1\, dy_1 = xy + \int_0^y\!\int_0^x u^{(n)}\, dx_1\, dy_1.$$
Starting with $u^{(0)} = 0$, we get $u^{(1)} = xy$, $u^{(2)} = xy + (xy)^2/4$,
$$u^{(3)} = xy + \int_0^y\!\int_0^x \left[x_1 y_1 + \frac{(x_1 y_1)^2}{4}\right] dx_1\, dy_1 = xy + \frac{(xy)^2}{(2!)^2} + \frac{(xy)^3}{(3!)^2},$$
and so on. Evidently, the process gives
$$u(x,y) = \sum_{n=1}^{\infty} \frac{(xy)^n}{(n!)^2}.$$
That this is a solution can be verified by actual substitution.
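The Picard iteration for this illustration can be carried out exactly with rational arithmetic. The sketch below assumes a representation chosen only for convenience: an approximation is stored as a dictionary mapping k to the coefficient of $(xy)^k$, which suffices because every iterate here is a polynomial in the product xy.

```python
from fractions import Fraction

def picard_step(coeffs):
    """One Picard iteration for u_xy = 1 + u with u = 0 on both axes.

    coeffs[k] is the coefficient of (x*y)**k in the current approximation.
    Integrating (x1*y1)**k over [0, x] x [0, y] gives (x*y)**(k+1) / (k+1)**2.
    """
    new = {1: Fraction(1)}                     # double integral of 1 is x*y
    for k, c in coeffs.items():
        new[k + 1] = new.get(k + 1, Fraction(0)) + c / (k + 1) ** 2
    return new

approx = {}                                    # u^(0) = 0
for _ in range(4):
    approx = picard_step(approx)               # after the loop: u^(4)
```

After four steps the coefficients are 1, 1/4, 1/36, and 1/576, that is $1/(n!)^2$ for n = 1 to 4, in agreement with the pattern found by hand.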
26. The Associated Difference Equations. Let $h$ be a positive number
and $u = u(x,y)$ a function of x and y. The difference operators $\Delta_x$ and $\Delta_y$
are defined by
$$\Delta_x u = \frac{u(x + h, y) - u(x,y)}{h}, \qquad \Delta_y u = \frac{u(x, y + h) - u(x,y)}{h}. \tag{26-1}$$
Passing to the limit as $h \to 0$, we get the partial derivatives; that is,
$$\lim_{h \to 0} \Delta_x u = u_x, \qquad \lim_{h \to 0} \Delta_y u = u_y \tag{26-2}$$
when the limits exist. If the second differences are defined by
$$\Delta_{xx} u = \frac{u(x + h, y) - 2u(x,y) + u(x - h, y)}{h^2}, \qquad \Delta_{yy} u = \frac{u(x, y + h) - 2u(x,y) + u(x, y - h)}{h^2}, \tag{26-3}$$
it can be shown, in general, that
$$\lim_{h \to 0} \Delta_{xx} u = u_{xx}, \qquad \lim_{h \to 0} \Delta_{yy} u = u_{yy}$$
(see Prob. 1). Hence the three difference equations
$$\Delta_{xx} u - \Delta_{yy} u = 0, \qquad \Delta_{xx} u = \Delta_y u, \qquad \Delta_{xx} u + \Delta_{yy} u = 0$$
as $h \to 0$ become the respective differential equations
$$u_{xx} - u_{yy} = 0, \qquad u_{xx} = u_y, \qquad u_{xx} + u_{yy} = 0.$$
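The passage to the limit can be observed numerically. The following sketch uses a sample function and a sample point, both assumptions made for illustration, and records how far the first and second difference quotients in x are from the corresponding derivatives as h shrinks.

```python
import math

def diff_x(u, x, y, h):
    """First difference quotient in x: [u(x+h, y) - u(x, y)] / h."""
    return (u(x + h, y) - u(x, y)) / h

def diff_xx(u, x, y, h):
    """Second difference in x: [u(x+h, y) - 2u(x, y) + u(x-h, y)] / h^2."""
    return (u(x + h, y) - 2.0 * u(x, y) + u(x - h, y)) / (h * h)

def u(x, y):
    return math.sin(x) * math.exp(y)          # sample smooth function

x0, y0 = 0.7, 0.2
u_x = math.cos(x0) * math.exp(y0)             # exact first derivative
u_xx = -math.sin(x0) * math.exp(y0)           # exact second derivative

steps = (1e-1, 1e-2, 1e-3)
errors_x = [abs(diff_x(u, x0, y0, h) - u_x) for h in steps]
errors_xx = [abs(diff_xx(u, x0, y0, h) - u_xx) for h in steps]
```

For this function the first-difference error shrinks roughly in proportion to h and the second-difference error roughly in proportion to $h^2$; for much smaller h, rounding error in the subtraction eventually dominates.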
The correspondence of difference equations and differential equations is important
because there are numerical methods of solving the former which are especially adapted
to high-speed computers (cf. Chap. 9, Sec. 19). As $h \to 0$, the solution of the
difference equation generally tends to the solution of the corresponding differential
equation. This fact gives a means of numerical approximation which has been extensively
exploited. Because of space limitations we shall consider merely the determinacy of
the solutions, our objective being to clarify further the distinction among elliptic,
parabolic, and hyperbolic equations.
Case I. Elliptic Equation. Using (26-3) the reader can verify that
$$\Delta_{xx} u + \Delta_{yy} u = 0 \tag{26-4}$$
can be written in the form
$$u(x,y) = \tfrac{1}{4}\left[u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)\right]. \tag{26-5}$$
This equation gives a relation between the five values of u at the five
neighboring lattice points¹ illustrated in Fig. 35; in fact, the value at the
central lattice point is the arithmetic mean of the values at the four neighbors.
The corresponding property for Laplace's equation is the average-value
theorem given in the example of Sec. 21.
[Fig. 36]
To state the Dirichlet problem for the difference equation (26-4) we
say that a point is interior to a region if its four neighbors are points of
the region. A boundary point is a point for which at least one neighbor
belongs to the region and at least one does not. For instance the points
• in Fig. 36 are boundary points. In the Dirichlet problem a function u
satisfying (26-4) is given at every boundary point, and it is required to
find u at the interior points. We shall now establish both the existence
and the uniqueness of the solution. 2
Suppose, then, that u is known at every lattice point bounding a given
region (Fig. 36). If we write down the equation (26-5) for each interior
point (x,y), we obtain a system of n linear equations in n unknowns. It
will be seen presently that the determinant of this system is not zero, and
hence there is one, and only one, solution. On the other hand, if the values
are not prescribed at every boundary point, there are always more unknowns
than equations, and the solution is not determined uniquely. These
properties are analogous to those obtained previously for the Laplace
equation.
¹ That is, points of the form (mh, nh) with integers m and n.
² The region is assumed bounded, so that the number of interior points is a finite
number n.
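The linear system obtained from (26-5) can be solved in several ways. The sketch below does not form the determinant; instead it uses repeated Gauss-Seidel sweeps, a relaxation technique swapped in here purely for illustration, on a square block of lattice points with $h = 1$. The boundary data are taken from $x^2 - y^2$, which happens to satisfy (26-5) exactly, so the computed interior values can be compared with it.

```python
def solve_dirichlet(boundary, n, sweeps=200):
    """Relax (26-5) on the lattice points (i, j) with 0 <= i, j <= n.

    Points with i or j equal to 0 or n carry the prescribed values
    boundary(i, j); interior values start at zero and are replaced
    repeatedly by the mean of their four neighbors (Gauss-Seidel sweeps).
    """
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                u[i][j] = boundary(i, j)
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                u[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j]
                                  + u[i][j + 1] + u[i][j - 1])
    return u

def exact(i, j):
    # i*i - j*j equals the mean of its four lattice neighbors.
    return float(i * i - j * j)

n = 6
grid = solve_dirichlet(exact, n)
max_error = max(abs(grid[i][j] - exact(i, j))
                for i in range(n + 1) for j in range(n + 1))
```

Because the solution of the discrete Dirichlet problem is unique, the relaxed values must agree with $x^2 - y^2$ at every interior point once the sweeps have converged.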
To show that the determinant is not zero, we shall analyze the special case in which
the boundary values are zero. In this case the system of linear equations obtained by
writing (26-4) at each interior point is homogeneous. If the determinant is zero, the
system will have a solution other than the trivial solution $u \equiv 0$. Without loss of
generality we can suppose that this nontrivial solution u is positive at some point.
Let the maximum value of u over all the lattice points be denoted by $m > 0$, and
let P be a point where $u = m$. Evidently P cannot be on the boundary, since $u > 0$
at P. Hence, P is interior, and the value of u at P is the average of the values at the
four neighbors. Now $u \le m$ at these neighbors, since $m$ is the maximum. If $u < m$ at
any neighbor, then the average is $< m$, so that $u(P) < m$. This contradiction shows
that $u = m$ at the four neighbors of P.
We can now repeat the process, starting with one of these four neighbors instead of P.
Proceeding in this way we find that $u = m$ at every lattice point. But that is impossible,
since $u = 0 \ne m$ on the boundary. Hence the assumption that the determinant was
zero led to a contradiction.
PROBLEMS
1. (a) Show that $\Delta_{xx} u(x,y) = \Delta_x[\Delta_x u(x - h, y)]$. (b) If u has a Taylor series
expansion about the point (x,y), show that $\Delta_{xx} u \to u_{xx}$ as $h \to 0$. Hint: Use the first six
terms of $u(x + h, y + k) = a + bh + ck + \cdots$.
2. Suppose $\Delta_{xx} u + \Delta_{yy} u = 0$, and suppose u is known for $x = 0$ and for $x = h$
($y = h, 2h, 3h, \ldots$). In what region of the xy plane is u determined? Hint: See Fig. 35.
3. Let $\Delta_{xx} u + \Delta_{yy} u = 0$, and suppose $u(0,y) = 1$ for all y, $u(2h,y) = 2$ for all y,
$u(h,0) = u(h,4h) = 0$. Find $u(h,2h)$.
27. Further Discussion of Difference Equations. According to the foregoing
discussion, the elliptic case leads to a set of simultaneous equations
for determination of the unknown function u. In the parabolic and hyperbolic
cases, as we shall now see, the values of u can be obtained successively.
Case II. Parabolic Equation. By (26-1) and (26-3) the equation
$$\Delta_{xx} u = \Delta_y u \tag{27-1}$$
takes the form
$$u(x + h, y) - (2 - h)\,u(x,y) + u(x - h, y) = h\,u(x, y + h). \tag{27-2}$$
This shows that if u is known at the three collinear points in Fig. 37, then
u can be found at the fourth point.
By analogy with the problem of heat flow discussed in Sec. 24, let u
be given at the points • on the boundary of the semi-infinite rectangle
in Fig. 38. Referring to Fig. 37, we see that u can be found at the lattice
points with $y = h$ in the rectangle. Repetition gives u for $y = 2h$, and
so on. Thus, u is determined throughout the rectangle, just as in the case
of Fig. 31. The process works equally well when the rod is infinite and
u is given at all the lattice points on the x axis.
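The row-by-row determination described above can be sketched as follows. The function name, the mesh, and the test data are assumptions: the data are taken from $u = x^2 + 2y$, which satisfies the difference equation (27-1) exactly, so the marched values can be checked against it.

```python
def march_parabolic(row, left, right, h, steps):
    """Advance (27-2) upward from a known row of lattice values.

    row holds u at x = 0, h, ..., (len(row) - 1) * h on the starting row;
    left(m) and right(m) give the prescribed end values on row m.
    """
    rows = [list(row)]
    for m in range(1, steps + 1):
        old = rows[-1]
        new = [left(m)]
        for i in range(1, len(old) - 1):
            # (27-2) solved for the value at (x, y + h).
            new.append((old[i + 1] - (2.0 - h) * old[i] + old[i - 1]) / h)
        new.append(right(m))
        rows.append(new)
    return rows

h = 0.5
xs = [i * h for i in range(5)]                       # x = 0, 0.5, ..., 2
start = [x * x for x in xs]                          # u(x, 0) = x^2
rows = march_parabolic(start,
                       left=lambda m: 2.0 * m * h,           # u(0, mh)
                       right=lambda m: xs[-1] ** 2 + 2.0 * m * h,
                       h=h, steps=3)
max_error = max(abs(rows[m][i] - (xs[i] ** 2 + 2.0 * m * h))
                for m in range(4) for i in range(5))
```

This sketch illustrates determinacy only. As a numerical method for the heat equation, marching with equal steps h in x and y is an explicit scheme whose rounding errors grow rapidly when h is small, so practical computations choose the two mesh sizes differently.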
[Fig. 37: lattice points $(x - h, y)$, $(x + h, y)$, and $(x, y + h)$]
[Fig. 38]
Because the pattern of Fig. 37 points upward, so to speak, it is impossible to proceed
in the negative y direction when the rod is infinite. The very first step leads to a system
of infinitely many equations in infinitely many unknowns. Inasmuch as $y = \alpha^2 t$, where
t is time, this fact expresses the irreversibility of thermodynamic processes.
Further insight into the one-directional character of t is given by (9-10) and (18-4).
In general, these expressions are infinitely differentiable for $t > 0$ but divergent for
$t < 0$. This behavior of the heat equation contrasts to that of the wave equation. As
we have repeatedly observed and will see again in the sequel, the latter is meaningful
for negative t.
Case IIIa. Hyperbolic Equation, First Problem. Writing the equation
$$\Delta_{xx} u - \Delta_{yy} u = 0 \tag{27-3}$$
in the form
$$u(x + h, y) + u(x - h, y) - u(x, y + h) - u(x, y - h) = 0 \tag{27-4}$$
we see that the corresponding pattern is that shown in Fig. 39. If u is given at
any three of the four lattice points, then (27-4) gives u at the fourth point.
Inasmuch as the pattern is symmetric, one can proceed in the positive
y direction and in the negative y direction with equal ease.
[Fig. 39: lattice points $(x, y + h)$, $(x - h, y)$, and $(x, y - h)$]
To discuss the analogue of the initial-value problem (Sec. 25, Case IIIa), let u
and $\Delta_y u$ be given in an interval of lattice points on the x axis. This is
equivalent to specifying u itself on two adjacent rows of lattice
points, as indicated by the black dots in Fig. 40. Considering Fig. 39 in
conjunction with Fig. 40, we see that the region of determination for u
consists of the lattice points in the square. The analogy with Fig. 32 is
evident.
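The shrinking region of determination can be seen by marching (27-4) upward from two adjacent rows. In this sketch the storage scheme (one dictionary per row, keyed by the integer column index) and the test data are assumptions; the data come from $u = (x - y)^2 + (x + y)^3$ with $h = 1$, which satisfies (27-4) exactly. Only the upward half of the region is computed.

```python
def march_hyperbolic(row0, row1):
    """Fill successive rows above two known adjacent rows by (27-4).

    Each row is a dict mapping the column index i to u at (i*h, m*h).
    A new value needs its two neighbors on the current row and the
    value directly below on the previous row, so every new row loses
    one lattice point at each end.
    """
    rows = [dict(enumerate(row0)), dict(enumerate(row1))]
    while True:
        below, current = rows[-2], rows[-1]
        new = {i: current[i + 1] + current[i - 1] - below[i]
               for i in below
               if i + 1 in current and i - 1 in current}
        if not new:
            return rows
        rows.append(new)

def exact(i, m):
    return (i - m) ** 2 + (i + m) ** 3

rows = march_hyperbolic([exact(i, 0) for i in range(9)],
                        [exact(i, 1) for i in range(9)])
widths = [len(r) for r in rows]
```

With nine points in each starting row the widths run 9, 9, 7, 5, 3, 1, so the determined points above the data form a triangle whose apex lies five rows up.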
Case IIIb. Hyperbolic Equation, Second Problem. If u is given on two
adjacent sides of a rectangle as shown in Fig. 41, we can apply Fig. 39,
starting at P. It will be found that u is determined at the indicated
points ○, and at no others. This behavior corresponds to that found in
Sec. 25, Case IIIb.
PROBLEMS
1. In Fig. 38 let $u = 0$ on the vertical rows of points •, and $u = 1$ on the horizontal
row •, for $0 < x < l$. Assuming $h = 1$ in (27-2) find u(h,Qh).
2. In Fig. 40 let u = 1, 0, 0, 2, 0, 0, 3, 0, 0 on the bottom row of points • (in order), and
let $\Delta_y u = 0$ at these points. Find $u(3h,5h)$.
3. Let $u(P) = 0$ in Fig. 41. Find the value of u at the opposite corner if $u = 0$ at
the points • on the left of P, and $u = 2$ at the points • on the right of P.
28. An Example: Flow of Electricity in a Cable. Many physical problems
lead to an equation that changes type, according to the values of the physical
parameters. Since the character of the solutions undergoes a corresponding
change, this phenomenon has great practical importance.
As an illustration we shall consider the flow
of electricity in linear conductors (such as
telephone wires or submarine cables) in
which the current may leak to ground.
[Fig. 42]
Let a long, imperfectly insulated cable (Fig. 42) carry an electric current
whose source is at A. The current is assumed to flow to the receiving end
at P through the load B and to return through the ground. It is assumed
that the leaks occur along the entire length of the cable because of
imperfections in the insulating sheath. Let the distance, measured along
the length of the cable, be denoted by x; then the emf V (volts) and the
current I (amperes) are functions of x and t. The resistance of the cable
will be denoted by R (ohms per mile), and the conductance from sheath
to ground by G (mhos per mile). It is known that the cable acts as an
electrostatic condenser, and the capacitance of the cable to ground per
unit length is assumed to be C (farads per mile); the inductance per mile
will be denoted by L (henrys per mile).
Consider an element CD of the cable of length $\Delta x$. If the emf is V at
C and $V + \Delta V$ at D, then the change in voltage across the element $\Delta x$
is produced by the resistance and the inductance drops, so that one can
write
$\Delta V = -\left(IR\,\Delta x + \frac{\partial I}{\partial t}\,L\,\Delta x\right).$
The negative sign signifies that the voltage is a decreasing function of
x. Dividing through by $\Delta x$ and passing to the limit as $\Delta x \to 0$ give the
equation for the voltage:
$\frac{\partial V}{\partial x} = -IR - L\,\frac{\partial I}{\partial t}.$    (28-1)
The decrease in current, on the other hand, is due to the leakage and the
action of the cable as a condenser. Hence, the drop in current $\Delta I$ across
the element $\Delta x$ of the cable is
$\Delta I = -VG\,\Delta x - \frac{\partial V}{\partial t}\,C\,\Delta x,$
so that
$\frac{\partial I}{\partial x} = -VG - C\,\frac{\partial V}{\partial t}.$    (28-2)
Equations (28-1) and (28-2) are simultaneous partial differential equa-
tions for the voltage and current. The voltage V can be eliminated from
these equations by differentiating (28-2) with respect to x to obtain
$I_{xx} = -V_x G - C V_{tx}.$
Substituting for $V_x$ from (28-1) gives
$I_{xx} = IRG + LG\,I_t - C V_{tx},$
from which $V_{tx}$ can be eliminated by using the expression for $V_{xt}$ obtained
by differentiating (28-1) with respect to t, namely $V_{xt} = -R I_t - L I_{tt}$.
Thus one is led to
$I_{xx} - LC\,I_{tt} = (LG + RC)\,I_t + IRG.$    (28-3)
A similar calculation shows that (28-3) is also satisfied by the voltage V.
Evidently, (28-3) is hyperbolic when $LC \ne 0$ but parabolic when $LC = 0$.
When the cable is lossless, $R = G = 0$. Equation (28-3) and the cor-
responding equation for V are, then,
$I_{xx} = LC\,I_{tt}, \qquad V_{xx} = LC\,V_{tt}.$    (28-4)
Comparing with the equation for wave motion (Sec. 1), we see that the cable
propagates electromagnetic waves with velocity
$a = (LC)^{-1/2}.$
The hyperbolic equation (28-4) is appropriate if the frequency is high and
the loss is low.
For an audio-frequency submarine cable it is more appropriate to take
$G = L = 0$. The equations are then parabolic:
$I_{xx} = RC\,I_t, \qquad V_{xx} = RC\,V_t.$    (28-5)
Instead of representing waves, the propagation of V and I is now identical
with the flow of heat in rods. Comparing with (9-5) gives
$\alpha = (RC)^{-1/2}.$
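The change of type can be made concrete with a short numerical sketch. The function names below are illustrative assumptions, not notation from the text; the code only encodes the statements that (28-3) is hyperbolic when LC ≠ 0 and parabolic when LC = 0, together with the two constants read off from (28-4) and (28-5).

```python
import math

def classify_cable(L, C):
    """Type of Eq. (28-3): hyperbolic when LC != 0, parabolic when LC == 0."""
    return "hyperbolic" if L * C != 0 else "parabolic"

def wave_speed(L, C):
    """Velocity a = (LC)^(-1/2) of the lossless line, Eq. (28-4)."""
    return 1.0 / math.sqrt(L * C)

def diffusion_constant(R, C):
    """Constant alpha = (RC)^(-1/2) of the parabolic case, Eq. (28-5)."""
    return 1.0 / math.sqrt(R * C)

# Hypothetical per-mile values, chosen only to exercise the functions.
print(classify_cable(0.0, 8e-8))      # inductance neglected
print(classify_cable(4.0, 0.25))      # inductance retained
print(wave_speed(4.0, 0.25))
```

With consistent units (henrys, farads, and ohms per mile) the velocity comes out in miles per second.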
Example: Consider a submarine cable l miles in length, and let the voltage at the
source A, under steady-state conditions, be 12 volts and at the receiving end R be 6
volts. At a certain instant t = 0, the receiving end is grounded, so that its potential
is reduced to zero, but the potential at the source is maintained at its constant value
of 12 volts. Determine the current and voltage in the line subsequent to the ground-
ing of the receiving end.
It is required to find V in (28-5) subject to the boundary conditions
$V(0,t) = 12, \qquad V(l,t) = 0, \qquad t > 0.$    (28-6)
The initial condition is
$V(x,0) = 12 - \frac{6x}{l},$    (28-7)
since the steady-state solution of (28-5) is a linear function of x (Sec. 9, Example 1).
The voltage V(x,t) subsequent to the grounding can be thought of as being made up
of a steady-state¹ voltage $V_S(x)$ and a transient voltage $V_T(x,t)$ which decreases rapidly
with time. Thus,
$V(x,t) = V_S(x) + V_T(x,t).$    (28-8)
Since $V_S(x)$ is linear, its value is given by the boundary conditions as
$V_S(x) = 12 - \frac{12x}{l}.$    (28-9)
Equations (28-6) and (28-7) now yield
$V_T(0,t) = V_T(l,t) = 0, \qquad V_T(x,0) = \frac{6x}{l}.$
Since $V_T$ satisfies (28-5), we can use the solution of the heat equation (9-10) with
$\alpha^2 = 1/RC$. The result is
$V_T(x,t) = \sum_{n=1}^{\infty} \left( \frac{2}{l} \int_0^l \frac{6x_1}{l} \sin\frac{n\pi x_1}{l}\,dx_1 \right) e^{-(1/RC)(n\pi/l)^2 t} \sin\frac{n\pi x}{l}.$
The function V is now given by (28-8).
¹ Compare Sec. 9, Example 2.
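The series can be evaluated numerically. The sketch below assumes the coefficient integral has been carried out in closed form, $\frac{2}{l}\int_0^l \frac{6x_1}{l}\sin\frac{n\pi x_1}{l}\,dx_1 = \frac{12(-1)^{n+1}}{n\pi}$, which follows from one integration by parts; the function name and the truncation parameter `terms` are illustrative choices, not part of the text.

```python
import math

def cable_voltage(x, t, l, R, C, terms=200):
    """Partial sum of V = V_S + V_T from (28-8) for the grounded cable.

    x, l in miles; t in seconds; R in ohms per mile; C in farads per mile.
    """
    steady = 12.0 - 12.0 * x / l                       # V_S(x), Eq. (28-9)
    transient = 0.0
    for n in range(1, terms + 1):
        b_n = 12.0 * (-1) ** (n + 1) / (n * math.pi)   # Fourier sine coefficient of 6x/l
        decay = math.exp(-((n * math.pi / l) ** 2) * t / (R * C))
        transient += b_n * decay * math.sin(n * math.pi * x / l)
    return steady + transient

# Hypothetical line data (100 miles, R = 0.3 ohm/mile, C = 0.08 microfarad/mile).
print(cable_voltage(50.0, 0.0, 100.0, 0.3, 8e-8, terms=20001))  # near 12 - 6x/l at x = l/2
print(cable_voltage(25.0, 1e3, 100.0, 0.3, 8e-8))               # transient has died out
```

At t = 0 the sine series converges slowly (its terms fall off only like 1/n), so many terms are needed to reproduce the initial voltage; for t > 0 the exponential factors make a handful of terms sufficient.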
PROBLEMS
1. By using (28-1) with L = 0, find I in the Example.
2. Find the emf in the cable whose length is 100 miles and whose characteristics are
as follows: R = 0.3 ohm per mile, C = 0.08 μf per mile, L = 0, G = 0. If the voltage
at the source is 6 volts and at the terminal end 2 volts, what is the voltage after the
terminal end has been suddenly grounded? [Use (28-5).]
3. Using (28-5), find the current in a cable 1,000 miles long, whose potential at the
source, under steady-state conditions, is 1,200 volts and at the terminal end is 1,100
volts. What is the current in the cable after the terminal end has been sud-
denly grounded? Use R = 2 ohms per mile and C = 3·10⁻⁷ farad per mile.
29. Characteristics and Canonical Form. If a, b, c are continuous func-
tions of x and y, with $a \ne 0$, then the partial differential equation
$a u_{xx} + 2b u_{xy} + c u_{yy} = H(x,y,u,u_x,u_y)$    (29-1)
can be simplified by use of the equation
$a\,dy^2 - 2b\,dy\,dx + c\,dx^2 = 0.$    (29-2)
Setting $dy = p\,dx$ in (29-2) and solving the resulting quadratic give
$p = a^{-1}[b + (b^2 - ac)^{1/2}]$ or $p = a^{-1}[b - (b^2 - ac)^{1/2}].$    (29-3)
Since $p = dy/dx$, Eqs. (29-3) are ordinary differential equations of the
first order, and hence the solutions may be expected to contain an arbitrary
constant c. If the solutions are written in the form
$X(x,y) = c$ or $Y(x,y) = c,$    (29-4)
the resulting curves (29-4) are called the characteristics of (29-1).
For example, when (29-1) is the wave equation
$a^2 u_{xx} - u_{tt} = 0$    (29-5)
the differential equation (29-2) is
$a^2\,dt^2 - dx^2 = 0.$
Since this reduces to $dx/dt = \pm a$, the characteristics are the straight lines
$x - at = c, \qquad x + at = c.$
It was shown in Sec. 1 that the change of variable
$r = x - at, \qquad s = x + at, \qquad u(x,t) = U(r,s)$
reduces (29-5) to the form $U_{rs} = 0$, and a physical interpretation of the characteristics
was given in Sec. 4.
Equation (29-1) is said to be in canonical form if it has one of the three
forms
$u_{xx} + u_{yy} = H, \qquad u_{xx} = H, \qquad u_{xy} = H,$
where H is a function of x, y, u, $u_x$, and $u_y$. It is a basic fact that the
reduction to canonical form can be achieved by means of the characteristics,
and we shall now describe¹ the procedure.
Case I. Elliptic Equation. When $b^2 - ac < 0$, the two values of p
in (29-3) are conjugate complex, and hence the same is true of X and Y
in (29-4). That is,
$X = r(x,y) + is(x,y), \qquad Y = r(x,y) - is(x,y),$
where r and s are real. In this case the reduction can be achieved by choos-
ing r and s as new independent variables. If $u(x,y) = U(r,s)$, Eq. (29-1)
gives an equation for U in which the second derivatives occur as $U_{rr} + U_{ss}$.
Case II. Parabolic Equation. When $b^2 - ac = 0$ the two values of
p in (29-3) are real and equal. Hence the same is true of X and Y in
(29-4). In this case the reduction can be achieved by the change of vari-
able
$r = X(x,y), \qquad s =$ any function independent of X.
The second derivatives of U now occur as $U_{ss}$.
Case III. Hyperbolic Equation. When $b^2 - ac > 0$, the roots (29-3)
are real and unequal, and the same is true of X and Y. The reduction is
achieved by taking
$r = X(x,y), \qquad s = Y(x,y)$
as new independent variables. The second derivatives of U occur only
as $U_{rs}$.
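The three cases depend only on the sign of $b^2 - ac$ at the point in question, which is easy to tabulate. The following is a minimal sketch with hypothetical function names; it evaluates the discriminant and the two slopes p of (29-3) at a single point, using complex arithmetic so that the elliptic case is covered as well.

```python
import cmath

def classify(a, b, c):
    """Type of a*u_xx + 2b*u_xy + c*u_yy = H at a point, from the sign of b^2 - ac."""
    d = b * b - a * c
    if d < 0:
        return "elliptic"
    if d == 0:          # exact comparison; with rounded data a tolerance may be wanted
        return "parabolic"
    return "hyperbolic"

def characteristic_slopes(a, b, c):
    """The two values of p = dy/dx from (29-3); requires a != 0."""
    root = cmath.sqrt(b * b - a * c)
    return ((b + root) / a, (b - root) / a)

print(classify(1.0, 0.0, 1.0))    # Laplace's equation u_xx + u_yy = 0
print(classify(1.0, 0.0, 0.0))    # heat equation u_xx = u_t (second-order part only)
print(classify(4.0, 0.0, -1.0))   # wave equation (29-5) with a = 2, y playing the role of t
print(characteristic_slopes(4.0, 0.0, -1.0))
```

For the wave equation the slopes are dt/dx = ±1/a, in agreement with dx/dt = ±a found above.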
To illustrate the procedure we shall consider the equation
$u_{xx} - kx\,u_{xy} + 4x^2 u_{yy} = 0$    (29-6)
when k = 0, 4, or 5. According to (29-3),
$p = -\tfrac12 kx \pm (\tfrac14 k^2x^2 - 4x^2)^{1/2}.$    (29-7)
When k = 0, this gives $p = \pm 2ix$. The equations $y' = \pm 2ix$ have the solutions
$y - ix^2 = c, \qquad y + ix^2 = c,$
where c is constant. Taking real and imaginary parts,
$r = y, \qquad s = x^2.$
With $u(x,y) = U(r,s)$, the derivatives are
$u_{xx} = 4x^2 U_{ss} + 2U_s, \qquad u_{xy} = 2x\,U_{sr}, \qquad u_{yy} = U_{rr},$
and substitution into (29-6) with k = 0 gives the canonical form
$U_{rr} + U_{ss} = -(2x^2)^{-1} U_s = -(2s)^{-1} U_s.$
When k = 4, the two roots (29-7) are both $p = -2x$. Solving this differential equa-
tion we see that (29-4) is
$y + x^2 = c, \qquad y + x^2 = c.$
Since $y + x^2$ and y are independent, we can take
$r = y + x^2, \qquad s = y.$
It is left for the reader to show that the canonical form is
$U_{ss} = -(2x^2)^{-1} U_r = -(2r - 2s)^{-1} U_r.$    (29-8)
Finally, the case k = 5 leads to two distinct real roots $p = -x$, $p = -4x$. Setting
$p = dy/dx$ and solving,
$y + \tfrac12 x^2 = c, \qquad y + 2x^2 = c.$
The change of variable
$r = y + \tfrac12 x^2, \qquad s = y + 2x^2$
now leads to the canonical form
$U_{rs} = (6s - 6r)^{-1}(U_r + 4U_s).$    (29-9)
¹ A proof may be found in A. G. Webster, “Partial Differential Equations of Mathe-
matical Physics,” p. 242, Teubner Verlagsgesellschaft, Leipzig, 1927.
PROBLEMS
1. Derive (29-8) and (29-9).
2. Describe the behavior of the characteristics of (28-3) as LC varies from zero to
infinity.
3. Reduce to canonical form
$3u_{xy} = u_{xx} + 2u_{yy}, \qquad 2u_{xy} = u_{xx} + u_{yy}, \qquad 2u_{xy} = u_{xx} + 2u_{yy}.$
30. Characteristics and Discontinuities. The function $u = f(x - at)$
represents a wave propagating in the positive x direction with velocity a.
If f(x) has the form shown¹ in Fig. 43, the motion exhibits a wave front
(Fig. 44), whose locus can be found by setting the argument of f equal to c:
$x - at = c.$
In the xt plane, we recognize that this equation describes a characteristic
of the wave equation (29-5).
¹ The intent is $f(x) = 0$ for $x > c$, $f'(c) = 0$, $f''(c+) - f''(c-) \ne 0$.
Discontinuities of the type just considered arise in many investigations,
ranging from the theory of the cracking of glass to the theory of super-
sonic flight. The locus of the discontinuity is always a characteristic, as
we shall presently see, and hence, the foregoing example is typical of the
general case.
[Fig. 43]
Since the locus is a characteristic, a discontinuity of the type in question
may arise on two families of curves for hyperbolic equations, it may arise
on one family for parabolic equations, and it cannot arise for elliptic equa-
tions. For example, the equations of fluid flow are elliptic at velocities
less than the velocity of sound in the fluid. But at velocities exceeding
the velocity of sound the equations become hyperbolic, and the fact that
a discontinuity is now possible permits the formation of a shock wave.
To discuss these questions mathematically, consider a solution surface
$u = u(x,y)$ satisfying (29-1). We suppose that u is continuous and has
continuous first derivatives but has a discontinuity in one of the second
derivatives on a certain curve C. This single solution is to be regarded as
two solutions $u_1(x,y)$ and $u_2(x,y)$ which are tangent along the curve C but
do not have equal second derivatives along C. We let the surfaces de-
fined by $u_1$ and $u_2$ extend past C, so that their first derivatives are well
defined on C.
The symbol ( ) denotes the jump of the function in the parentheses;
that is,
$(u) = u_1 - u_2$ evaluated on C.    (30-1)
Differentiating (30-1) with respect to x gives
$(u_x) = u_{1x} - u_{2x} = (u)_x$    (30-2)
and similarly for other derivatives. Our hypothesis is that
$(u) = (u_x) = (u_y) = 0$    (30-3)
but that one of the quantities $(u_{xx})$, $(u_{xy})$, or $(u_{yy})$ is not zero.
Differentiating the relation $(u_x) = 0$ with respect to x by the chain
rule yields
$(u_{xx}) + (u_{xy})\,y' = 0$    (30-4)
when we recall that y is a function of x on C and take due account of
(30-2). Similarly, differentiating $(u_y) = 0$ with respect to x yields
$(u_{yx}) + (u_{yy})\,y' = 0.$    (30-5)
Taking the ( ) of the partial differential equation (29-1), we get
$a(u_{xx}) + 2b(u_{xy}) + c(u_{yy}) = (H) = 0$    (30-6)
when H is continuous. The fact that (H) = 0 follows from (30-3) if we
recall that H does not involve the higher derivatives of u.
Equations (30-4), (30-5), and (30-6) are three linear homogeneous
equations in the three unknowns $(u_{xx})$, $(u_{xy}) = (u_{yx})$, and $(u_{yy})$. By
hypothesis not all these unknowns are zero, and hence, the coefficient
determinant must vanish:
$\begin{vmatrix} 1 & y' & 0 \\ 0 & 1 & y' \\ a & 2b & c \end{vmatrix} = 0.$    (30-7)
Expansion of this determinant yields the characteristic equation (29-2),
so that C must be a characteristic curve. Conversely, if C is a character-
istic, the determinant (30-7) is zero, the related homogeneous equations
have a nontrivial solution, and a discontinuity is possible.
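For the reader who wishes to see the expansion written out, a cofactor expansion of (30-7) along its first column, with $y' = dy/dx$, gives the following. Only the choice of column is an editorial assumption; any expansion yields the same polynomial in $y'$.

```latex
\begin{vmatrix} 1 & y' & 0 \\ 0 & 1 & y' \\ a & 2b & c \end{vmatrix}
  = 1\cdot\begin{vmatrix} 1 & y' \\ 2b & c \end{vmatrix}
  + a\cdot\begin{vmatrix} y' & 0 \\ 1 & y' \end{vmatrix}
  = (c - 2b\,y') + a\,y'^2 = 0 ,
\qquad\text{that is,}\qquad
a\,dy^2 - 2b\,dy\,dx + c\,dx^2 = 0 .
```

The last form results on multiplying through by $dx^2$ and is exactly (29-2).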
Example. Fundamental Solutions. A fundamental solution of a partial differential
equation is a solution of the form $f[\lambda(x,y)]$, where λ is a fixed function and f an arbitrary
function. For example, the equation $u_{xx} = u_{yy}$ has the fundamental solutions
$f_1(x - y)$ and $f_2(x + y)$    (30-8)
in which $\lambda = x - y$ and $\lambda = x + y$, respectively. We shall now see that if (29-1) has
the fundamental solution $f[\lambda(x,y)]$, then the curves
$\lambda(x,y) = c$, c const    (30-9)
are characteristics. For proof, it suffices to choose the arbitrary function f so that $f''(x)$
is continuous except at x = c. Then the function $u = f[\lambda(x,y)]$ has a discontinuity of
the type previously considered on the locus $\lambda(x,y) = c$, and the desired result follows.
This explains why the techniques used in Secs. 1 to 6 to study the wave equation are
not applicable to Laplace’s equation. Namely, d’Alembert’s method is based on the
fundamental solutions (30-8) and the Laplace equation, being elliptic, has no such
solutions.
CHAPTER 7
COMPLEX VARIABLE
Analytic Aspects
1. Complex Numbers
2. Functions of a Complex Variable
3. Elementary Complex Functions
4. Analytic Functions of a Complex Variable
5. Integration of Complex Functions. Cauchy’s Integral Theorem
6. Cauchy’s Integral Theorem for Multiply Connected Regions
7. The Fundamental Theorem of Integral Calculus
8. Cauchy’s Integral Formula
9. Harmonic Functions
10. Taylor’s Series
11. Laurent’s Expansion
12. Singular Points. Residues
13. Residue Theorem
14. Behavior of f(z) at Poles and Essential Singular Points
Geometric Aspects
15. Geometric Representation
16. Functions $w = z^n$ and $z = \sqrt[n]{w}$
17. The Functions $w = e^z$ and $z = \log w$
18. Conformal Maps
Applications
19. Steady Flow of Ideal Fluids
20. The Method of Conjugate Functions
21. The Problem of Dirichlet
22. Evaluation of Real Integrals by the Residue Theorem
This chapter contains a concise presentation of the rudiments of com-
plex-variable theory with an indication of its many uses in the solution of
important problems of physics and engineering. This theory, with roots
in potential theory and hydrodynamics, is among the most fertile and
beautiful of mathematical creations. Its unfolding left a deep imprint on
the whole of mathematics and on several branches of mathematical physics.
To an applied mathematician this theory is a veritable mine of effective
tools for the solution of important problems in heat conduction, elasticity,
hydrodynamics, and the flow of electric currents.
ANALYTIC ASPECTS
1. Complex Numbers. The analysis in the preceding chapters was
concerned principally with functions of real variables, that is, such vari-
ables as can be represented graphically by points on a number axis, say
the x axis of the cartesian coordinate system. The reader is familiar with
the fact that calculation of the zeros of the function $f(x) = ax^2 + bx + c$,
when the discriminant $b^2 - 4ac$ is negative, necessitates the introduction
of complex numbers of the form $u + iv$, where u and v are real numbers and
i is a number such that $i^2 = -1$.
A number of the form $u + iv$ can be represented by a point in a plane
referred to a pair of orthogonal x and y axes if it is agreed that the number
u represents the abscissa and v the ordinate of the point (Fig. 1). No
confusion is likely to arise if the point (u,v), associated with the number
$u + iv$, is labeled simply $u + iv$. It is clear that the point (u,v) can be
located by the terminus of a vector z whose origin is at the origin O of the
coordinate system. In this manner a one-to-one correspondence is established
between the totality of vectors in the xy plane and the complex numbers.
The vector z may be thought to represent the resultant of two vectors,
one of which is of magnitude u and directed along the x axis and the other
of magnitude v and directed along the y axis. Thus,
$z = u + iv,$
where u is spoken of as the real part of the complex number z and v as
the imaginary part. Therefore, if the points of the plane are referred to
a pair of coordinate axes, one can establish a correspondence between the
pair of real numbers (u,v) and a single complex number $u + iv$. In this
case the xy plane is called the plane of a complex variable, the x axis is
called the real axis, and the y axis is called the imaginary axis.
If v vanishes, then
$z = u + 0\cdot i = u$
is a number corresponding to some point on the real axis. Accordingly,
this mode of representation of complex numbers (due to Gauss and Argand)
includes as a special case the usual way of representing real numbers on the
number axis.
The equality of two complex numbers,
$a + ib = c + id,$
is interpreted to be equivalent to the two equations
$a = c$ and $b = d.$
In particular, $a + ib = 0$ is true if, and only if, a = 0 and b = 0.
If the polar coordinates of the point (u,v) (Fig. 1) are $(r,\theta)$, then
$u = r\cos\theta$ and $v = r\sin\theta,$
so that
$r = \sqrt{u^2 + v^2}$ and $\theta = \tan^{-1}\frac{v}{u}.$
The number r is called the modulus, or absolute value, and θ is called the
argument, or phase angle, of the complex number $z = u + iv$. It is clear
that the argument of a complex number is not unique, and if one writes
it as $\theta + 2k\pi$, where $0 \le \theta < 2\pi$ and $k = 0, \pm1, \pm2, \ldots$, then θ is called
the principal argument of z. The modulus of the complex number z is
frequently denoted by using absolute-value signs, so that
$r = |z| = |u + iv| = \sqrt{u^2 + v^2},$
and the argument θ is denoted by the symbol
$\theta = \arg z.$
The student is assumed to be familiar with the fundamental algebraic
operations on complex numbers, and these will not be entered upon in
detail here. It should be recalled that (cf. Chap. 2, Sec. 15)
$z_1 + z_2 = (x_1 + iy_1) + (x_2 + iy_2) = (x_1 + x_2) + i(y_1 + y_2),$
$z_1 \cdot z_2 = (x_1 + iy_1)(x_2 + iy_2) = (x_1x_2 - y_1y_2) + i(x_1y_2 + x_2y_1),$
$\frac{z_1}{z_2} = \frac{x_1 + iy_1}{x_2 + iy_2} = \frac{x_1x_2 + y_1y_2}{x_2^2 + y_2^2} + i\,\frac{x_2y_1 - x_1y_2}{x_2^2 + y_2^2},$
provided that $|z_2| = \sqrt{x_2^2 + y_2^2} \ne 0$.
On representing complex numbers z x and z 2 by vectors, we can see at
once from Fig. 2 that they obey the familiar “parallelogram law of addition”
formulated in Chap. 4.
From elementary geometric considerations we deduce that
$|z_1 + z_2| \le |z_1| + |z_2|;$    (1-1)
that is, the modulus of the sum of two complex numbers is less than or equal
to the sum of the moduli. This follows at once from Fig. 2 on recalling that
the sum of two sides of a triangle is not less than the third side.
Also,
$|z_1 + z_2| \ge |z_1| - |z_2|;$    (1-2)
that is, the modulus of the sum is greater than or equal to the difference of the
moduli. This follows from the fact that the length of one side of a triangle
is not less than the difference of two other sides.
Equations (1-1) and (1-2) yield a useful inequality,
$|z_1| - |z_2| \le |z_1 - z_2| \le |z_1| + |z_2|,$    (1-3)
indicated in Fig. 3.
When calculations are carried out with complex numbers, the notion of
the conjugate complex number is useful. We define the conjugate $\bar z$ of the
number $z = x + iy$ by the formula
$\bar z = x - iy.$
The application of the rules for addition, multiplication, and division
of complex numbers yields the following theorems:
(a) $\overline{z_1 + z_2} = \bar z_1 + \bar z_2,$    (1-4)
or, in words, the conjugate of the sum of two complex numbers is equal to the
sum of the conjugates;
(b) $\overline{z_1 z_2} = \bar z_1 \bar z_2,$    (1-5)
that is, the conjugate of the product is equal to the product of the conjugates;
(c) $\overline{(z_1/z_2)} = \bar z_1/\bar z_2,$    (1-6)
or the conjugate of the quotient is equal to the quotient of the conjugates.
We note that if $\bar z = z$, then z is real.
The geometric interpretation of multiplication and division of complex
numbers follows readily from polar representation of complex numbers.
Thus,
$z_1 z_2 = r_1(\cos\theta_1 + i\sin\theta_1)\,r_2(\cos\theta_2 + i\sin\theta_2)$
$= r_1r_2[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)].$    (1-7)
That is, the modulus of the product is equal to the product of the moduli and
the argument of the product is equal to the sum of the arguments.
Also,
$\frac{z_1}{z_2} = \frac{r_1(\cos\theta_1 + i\sin\theta_1)}{r_2(\cos\theta_2 + i\sin\theta_2)} = \frac{r_1}{r_2}[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)],$    (1-8)
as follows on multiplying the numerator and denominator in (1-8) by $\cos\theta_2 - i\sin\theta_2$.
Thus, the modulus of the quotient is the quotient of the moduli
and the argument of the quotient is obtained by subtracting the argument of the
denominator from that of the numerator.
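Rules (1-7) and (1-8) can be checked numerically. The sketch below uses Python's standard `cmath.polar` and `cmath.rect`; the helper names are hypothetical. Note that `cmath.polar` returns an argument in (−π, π] rather than the principal argument 0 ≤ θ < 2π used in the text; the two differ at most by 2π, which does not affect the product or quotient.

```python
import cmath

def polar_product(z1, z2):
    """Multiply via (1-7): moduli multiply, arguments add."""
    r1, t1 = cmath.polar(z1)
    r2, t2 = cmath.polar(z2)
    return cmath.rect(r1 * r2, t1 + t2)

def polar_quotient(z1, z2):
    """Divide via (1-8): moduli divide, arguments subtract (z2 must be nonzero)."""
    r1, t1 = cmath.polar(z1)
    r2, t2 = cmath.polar(z2)
    return cmath.rect(r1 / r2, t1 - t2)

z1, z2 = 1 + 1j, 2 - 3j
print(polar_product(z1, z2), z1 * z2)    # the two agree up to rounding
print(polar_quotient(z1, z2), z1 / z2)
```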
On extending formula (1-7) to the product of n complex numbers
$z_k = r_k(\cos\theta_k + i\sin\theta_k), \qquad k = 1, 2, \ldots, n,$
we get
$z_1 z_2 \cdots z_n = r_1r_2\cdots r_n[\cos(\theta_1 + \theta_2 + \cdots + \theta_n) + i\sin(\theta_1 + \theta_2 + \cdots + \theta_n)]$
and, in particular, if all $z_k$ are equal,
$z^n = [r(\cos\theta + i\sin\theta)]^n = r^n(\cos n\theta + i\sin n\theta).$    (1-9)
Formula (1-9) is known as the de Moivre formula, and we have shown that
it is valid for any positive integer n. We can show that it is also valid
for negative and fractional values of n.
Indeed, from (1-8) we deduce that
$\frac{1}{z} = \frac{\cos 0 + i\sin 0}{r(\cos\theta + i\sin\theta)} = \frac{1}{r}[\cos(-\theta) + i\sin(-\theta)],$
and since (1-9) is known to hold for positive integers n,
$z^{-n} = r^{-n}[\cos(-\theta) + i\sin(-\theta)]^n = r^{-n}[\cos(-n\theta) + i\sin(-n\theta)].$
This establishes the result (1-9) for negative integers n.
To prove the validity of (1-9) for fractional values of n, it suffices to
show that it holds when the integer n is replaced by 1/n, for on raising
the result to an integral power m, we obtain the desired formula for frac-
tional exponents.
Let
$w = z^{1/n} = \sqrt[n]{z},$    (1-10)
so that w is a solution of equation
$w^n = z.$    (1-11)
On introducing polar representations,
$w = R(\cos\varphi + i\sin\varphi), \qquad z = r(\cos\theta + i\sin\theta),$    (1-12)
where θ is the principal argument of z, we can write (1-11) with the aid
of (1-9) as
$w^n = R^n(\cos n\varphi + i\sin n\varphi) = r(\cos\theta + i\sin\theta).$
We conclude from this that
$R^n = r, \qquad n\varphi = \theta \pm 2k\pi, \qquad k = 0, 1, 2, \ldots,$
and thus
$R = \sqrt[n]{r}, \qquad \varphi = \frac{\theta \pm 2k\pi}{n}, \qquad k = 0, 1, 2, \ldots.$
Hence, from (1-12),
$w = \sqrt[n]{r}\left(\cos\frac{\theta \pm 2k\pi}{n} + i\sin\frac{\theta \pm 2k\pi}{n}\right),$
and on recalling (1-10), we see that
$z^{1/n} = [r(\cos\theta + i\sin\theta)]^{1/n} = r^{1/n}\left(\cos\frac{\theta \pm 2k\pi}{n} + i\sin\frac{\theta \pm 2k\pi}{n}\right).$    (1-13)
Since $\cos[(\theta \pm 2k\pi)/n]$ and $\sin[(\theta \pm 2k\pi)/n]$ have the same values for
two integers k differing by a multiple of n, the formula (1-13) yields just
n distinct values for $\sqrt[n]{z}$, namely,
$\sqrt[n]{z} = r^{1/n}\left(\cos\frac{\theta + 2k\pi}{n} + i\sin\frac{\theta + 2k\pi}{n}\right), \qquad k = 0, 1, 2, \ldots, n - 1.$    (1-14)
The validity of formula (1-9) for fractional values of n follows directly
from (1-14) upon raising $z^{1/n}$ to an integral power m.
We illustrate the use of formula (1-14) by two examples.
Example 1. Compute $\sqrt[n]{1}$. In this case z = 1 and its principal argument θ = 0.
Formula (1-14) then yields
$\sqrt[n]{1} = \cos\frac{2k\pi}{n} + i\sin\frac{2k\pi}{n}, \qquad k = 0, 1, \ldots, n - 1.$
If we plot these n roots of unity, we see that they coincide with the vertices of a regular
polygon of n sides inscribed in the unit circle, with one vertex of the polygon at z = 1.
Figure 4 shows this for n = 6.
Example 2. Find all roots of $\sqrt[3]{1 + i}$. Since $1 + i = \sqrt{2}\,[\cos(\pi/4) + i\sin(\pi/4)]$,
formula (1-14) gives
$\sqrt[3]{1 + i} = \sqrt[6]{2}\left(\cos\frac{(\pi/4) + 2k\pi}{3} + i\sin\frac{(\pi/4) + 2k\pi}{3}\right), \qquad k = 0, 1, 2.$
Thus the desired roots are
$w_1 = \sqrt[6]{2}\,(\cos\tfrac{1}{12}\pi + i\sin\tfrac{1}{12}\pi),$
$w_2 = \sqrt[6]{2}\,(\cos\tfrac{3}{4}\pi + i\sin\tfrac{3}{4}\pi),$
$w_3 = \sqrt[6]{2}\,(\cos\tfrac{17}{12}\pi + i\sin\tfrac{17}{12}\pi).$
These roots are represented in Fig. 5.
[Fig. 4]
[Fig. 5]
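Formula (1-14) translates directly into a few lines of code. The following is a minimal sketch, assuming the hypothetical helper name `nth_roots`; it takes the argument returned by `cmath.polar`, which lies in (−π, π] rather than in 0 ≤ θ < 2π, so the roots may be listed in a different order from the text, but the set of n values is the same.

```python
import cmath
import math

def nth_roots(z, n):
    """All n values of z**(1/n) from formula (1-14), k = 0, 1, ..., n-1."""
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), (theta + 2.0 * math.pi * k) / n)
            for k in range(n)]

# Example 2 of the text: the three cube roots of 1 + i.
for w in nth_roots(1 + 1j, 3):
    print(w, w ** 3)          # each cube reproduces 1 + i up to rounding

# Example 1 with n = 6: vertices of a regular hexagon on the unit circle.
print(nth_roots(1, 6))
```

The same function with n = 2 returns the two values of the square root of z, which differ only in sign.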
The reader unskilled in simple calculations involving complex numbers
is urged to work out the representative problems in the following list before
proceeding to the next section. The symbols Re (z) and Im (z) used in
some problems in this list denote, respectively, the real and imaginary
parts of a complex number z.
PROBLEMS
1. Find the moduli and principal arguments of the following numbers, and represent
the numbers graphically:
(a) 1 + tV 3, (b) 2 + 2i, <c) -2, (d) i\ (e)
1
if)
1 + i
w a - i) 4 .
2. Write the following complex numbers in the form a + bi:
1 + i' 1 -V (t- V5) 8 '
is)
(a) (1 - V3 0* (6)
(1 + i) t
(c) ■
V 3 *
1 - i ' v 1 4- i
3. Find the cubes of the following numbers:
(a) 1, (b) $\tfrac12(-1 + i\sqrt3)$, (c) $\tfrac12(-1 - i\sqrt3)$.
4. Find the cube roots of i, and represent them graphically.
5. Find all solutions of the equation $z^4 + 1 = 0$.
6. Verify that $z^2 - 2z + 2 = 0$ has the roots $z = 1 \pm i$.
7 . Compute and represent graphically the following numbers:
(a) </l, (b) W, (r) J/i, (d)
8. Find all the fifth roots of 1 + i, and represent them graphically.
9. Use de Moivre’s formula $[r(\cos\theta + i\sin\theta)]^n = r^n(\cos n\theta + i\sin n\theta)$ to obtain
$\cos 2\theta = \cos^2\theta - \sin^2\theta$ and $\sin 2\theta = 2\sin\theta\cos\theta$.
10. Write the following numbers in the form a - \~bi :
(a) Vi, (6) Vi - i, (c) ■ 1
Vl + * '
11. Prove that (a) $z_1 + z_2 = z_2 + z_1$, (b) $z_1z_2 = z_2z_1$, (c) $z_1(z_2 + z_3) = z_1z_2 + z_1z_3$.
12. Show that if $z_1z_2 = 0$, then $z_1 = 0$ or $z_2 = 0$.
13. Prove formulas (1-4), (1-5), and (1-6).
14. Find |z|, $\bar z$, Re (z), and Im (z) for the following:
(a) z - 1 — 2 f; (6) * - 3 + 4i; (c) r - h — .
3 + 4t
16. Show that (a) tz » — iz, (6) |?| ® |z| 3 .
16. What is the locus of points for which
(a) |z| = 1? (b) |z| < 1? (c) |z| > 1?
Hint: $|z| = \sqrt{x^2 + y^2}$.
17. If z = x + iy, what is the locus of points for which
(a) Re (z) > 1? (b) Im (z) > 1? (c) Re ($z^2$) = 1?
18. If z = x + iy, describe the loci:
(a) $|z - 1| = 2$; (b) $\frac{1}{|z|} = \text{const}$; (c) $\frac{|z - 1|}{|z + 1|} = \text{const}$.
19. Under what conditions does one have the relation
(a) $|z_1 + z_2| = |z_1| + |z_2|$? (b) $|z_1 + z_2| = |z_1| - |z_2|$?
20. If * « x 4* UA write the following in the form u 4* tv:
<«} ^ ** +* - 1. w -Lj.
Z i — Z Z T’ l
*1*5 +
2. Functions of a Complex Variable. A complex quantity $z = x + iy$
in which x and y are real variables is called a complex variable. We shall
speak of the plane in which the variable z is represented as the z plane.
If in some region of this plane for each $z = x + iy$ one or more complex
numbers $w = u + iv$ are determined, we say that w is a function of z
and write
$w = u + iv = f(z).$
Thus, $w = x^2 - y^2 + i2xy = (x + iy)^2 = z^2$
is a function of z defined throughout the z plane. Also,
$w = u + iv = x - iy = \bar z$
is a function of z. In fact, every expression of the form $u(x,y) + iv(x,y)$
in which u and v are real functions of x and y is a function of z, since
$x = \tfrac12(z + \bar z)$ and $y = (1/2i)(z - \bar z)$ are functions of z.
A complex function w = f(z) is single-valued if for each z in a given region
of the z plane there is determined only one value of w. If more than one
value of w corresponds to z, the function w = f(z) is multiple-valued. Thus
$w = x^2 - y^2 + i2xy = z^2$
and $w = x^2 - y^2 - i2xy = \bar z^2$
are single-valued functions of z. The function $w = \sqrt z$ for each $z \ne 0$
determines two complex numbers, for on setting $z = r(\cos\theta + i\sin\theta)$
and recalling formula (1-14), we get
$w = \sqrt r\left(\cos\frac{\theta + 2k\pi}{2} + i\sin\frac{\theta + 2k\pi}{2}\right), \qquad k = 0, 1,$
so that
$w_1 = \sqrt r\left(\cos\frac{\theta}{2} + i\sin\frac{\theta}{2}\right),$
$w_2 = \sqrt r\left[\cos\left(\frac{\theta}{2} + \pi\right) + i\sin\left(\frac{\theta}{2} + \pi\right)\right].$
Thus $w = \sqrt z$ is not single-valued.
The functions in the foregoing examples are defined throughout the
z plane. The function $w = 1/z$ is not defined at the origin z = 0, while
$w = 1/(|z| - 1)$ is not defined when |z| = 1, that is, when the points z
lie on the circle of radius 1 with center at the origin.
Of course, w = f(z) may be defined by different formulas in different
regions of the plane, or it may not be defined at all in certain regions.
In dealing with regions of the z plane we shall distinguish interior points
from those that lie on the boundaries of the region. A characteristic prop-
erty of the interior points is that about each interior point P one can draw
a circle with center at P and with nonzero radius r so small that the circle
contains only those points that belong to the region. The points on the
boundary of the region are not interior because every circle with the
boundary point as its center includes points that do not belong to the
region.
A region consisting only of interior points is said to be open. An ex-
ample of such a region is the circular region whose points z satisfy the
condition $|z| < R$. When the boundary of the region is included in the
region, the region is called closed. An example of a closed region is the
region consisting of points z such that $|z| \le R$.
If every point of the region is at a finite distance from the origin, the
region is said to be finite or bounded. Thus all points of the bounded region
lie within a circle $|z| = R$ if the radius R is chosen sufficiently large. The
region consisting of all points in the z plane is unbounded, and so is the
region consisting of the points satisfying the condition $|z| > 1$.
A plane region is simply connected if every closed curve drawn in the
region encloses only points of the region. Thus, a region bounded by an
ellipse is a simply connected region, while a region bounded by a pair of
concentric circles is not simply connected. A region that is not simply
connected is called multiply connected .
PROBLEMS
1. Express the following functions in the form u(x,y) + iv(x,y ):
(a) Z* - * + 1, (f>) (r) (d) 0) ,(/)* + M(z - i), ig ) -i,
Z ~t l
(h)
z 2 — 2z + 1
1
z + i
2
«**
z -f* 2
2. Describe the regions in the z plane defined by the following conditions:
(a) Re (z) <3; (6) Im(z)>l;(c) \z\>l](d)\ <|z|<2;« |*-1|<1;</) \z ~
< 1 ; (fir) |* +i|> 2.
3. Elementary Complex Functions. In Sec. 1 we defined the operations
of addition, multiplication, division, and root extraction for complex
numbers. These suffice to determine, for any z, values of such algebraic
expressions as
$w = \frac{a_0 z^m + a_1 z^{m-1} + \cdots + a_m}{b_0 z^n + b_1 z^{n-1} + \cdots + b_n}$
in which the powers m and n may be integers or fractions. However, they
do not provide direct means for defining the complex counterparts of the
real elementary transcendental¹ functions $e^x$, $\sin x$, $\log x$, $\tan^{-1} x$, etc.
¹ A variable w satisfying the equation P(z,w) = 0, where P is a polynomial in z and
w, is called an algebraic function of z. A function that is not algebraic is called transcen-
dental. The trigonometric and logarithmic functions and their inverses are called ele-
mentary transcendental functions.
A useful definition of a complex function such as $e^z$, for example, must
specialize to $e^x$ when z assumes real values. Also, it is desirable to pre-
serve the familiar law of exponents $e^{z_1}e^{z_2} = e^{z_1 + z_2}$.
A definitive formula for $e^z$ that fulfills these criteria is
$e^z = e^{x+iy} = e^x(\cos y + i\sin y).$    (3-1)
Moreover, as we shall presently see, it suggests sensible definitions for
all the other elementary transcendental functions. We note first that for
x = 0 the definition (3-1) yields
$e^{iy} = \cos y + i\sin y.$    (3-2)
On replacing y by −y we get
$e^{-iy} = \cos y - i\sin y.$    (3-3)
Adding and subtracting (3-2) and (3-3) we get the Euler formulas
$\cos y = \tfrac12(e^{iy} + e^{-iy}), \qquad \sin y = \frac{1}{2i}(e^{iy} - e^{-iy}).$    (3-4)
These formulas suggest that we define the trigonometric functions of z
as follows:
$\cos z = \tfrac12(e^{iz} + e^{-iz}), \qquad \sin z = \frac{1}{2i}(e^{iz} - e^{-iz}), \qquad \tan z = \frac{\sin z}{\cos z},$
$\cot z = \frac{1}{\tan z}, \qquad \sec z = \frac{1}{\cos z}, \qquad \csc z = \frac{1}{\sin z}.$    (3-5)
Using these definitions it is easy to check¹ that all the familiar formulas
of analytic trigonometry remain valid when real arguments are replaced
by the complex ones. For example,
$\sin^2 z + \cos^2 z = 1,$
$\sin(z_1 + z_2) = \sin z_1\cos z_2 + \cos z_1\sin z_2,$
and so on.
¹ See Prob. 1 at the end of this section. Also, cf. alternative definitions of $e^z$, $\sin z$,
and $\cos z$ in Sec. 17, Chap. 2.
The logarithm of a complex number z is defined in the same way as in
real variable analysis. Thus,
$w = \log z$    (3-6)
means that
$z = e^w,$    (3-7)
where e is the base of natural logarithms. Setting $w = u + iv$ in (3-7)
gives
$z = e^{u+iv} = e^u(\cos v + i\sin v)$    (3-8)
by (3-1). On the other hand, we can write z as
$z = x + iy = r(\cos\theta + i\sin\theta),$
so that (3-8) gives
$r(\cos\theta + i\sin\theta) = e^u(\cos v + i\sin v).$
It follows from this that
$e^u = r, \qquad v = \theta + 2k\pi, \qquad k = 0, \pm1, \pm2, \ldots.$    (3-9)
Since u and v are real, we conclude from (3-9) that u = Log r, where the
symbol Log is used to denote the logarithm encountered in real-variable
theory. We can thus write (3-6) in the form
$w = u + iv = \log z = \operatorname{Log} r + (\theta + 2k\pi)i$    (3-10)
or
$\log z = \tfrac12\operatorname{Log}(x^2 + y^2) + i\tan^{-1}\frac{y}{x},$    (3-11)
since $r = \sqrt{x^2 + y^2}$ and $\theta + 2k\pi = \tan^{-1}(y/x)$.
Thus log z has infinitely many values corresponding to the different
choices of the argument $\theta$ of z. Setting $k = 0$ in (3-10) and assuming
that $0 \le \theta < 2\pi$, we get a single-valued function
$$\log z = \operatorname{Log} r + \theta i, \qquad 0 \le \theta < 2\pi,$$
which is called the principal value of log z. If z is real and positive, the
principal value of log z equals Log r.
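The branches in (3-10) can be listed explicitly. The sketch below assumes Python's standard `cmath` and `math` modules; `log_branch` is a hypothetical helper name. Note that `cmath.phase` returns an argument in $(-\pi, \pi]$, so it is shifted here into the range $0 \le \theta < 2\pi$ used in the text for the principal value.

```python
import cmath
import math

def log_branch(z, k=0):
    """Return Log r + (theta + 2*k*pi) i with 0 <= theta < 2*pi, as in (3-10)."""
    r = abs(z)
    # cmath.phase gives an angle in (-pi, pi]; reduce it modulo 2*pi.
    theta = cmath.phase(z) % (2 * math.pi)
    return complex(math.log(r), theta + 2 * math.pi * k)

z = 1 - 1j
values = [log_branch(z, k) for k in (-1, 0, 1)]

for k, w in zip((-1, 0, 1), values):
    # Every branch must reproduce z when exponentiated, by (3-7).
    print(k, w, cmath.exp(w))
```

With this convention the principal value of log(1 - i) has imaginary part $7\pi/4$, whereas `cmath.log` would return $-\pi/4$; the two differ by $2\pi i$, that is, by a choice of branch.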
The definition (3-10) serves to define complex and irrational powers c
of the variable z by the formula
$$z^c = e^{c \log z}, \tag{3-12}$$
which is equivalent to the statement that $\log z^c = c \log z$. Inasmuch as
log z is infinitely-many-valued, it follows that $z^c$, in general, is an infinitely-many-valued
function.¹ The hyperbolic functions of z are defined by the
formulas
$$\sinh z = \frac{1}{2}\left(e^z - e^{-z}\right), \qquad \cosh z = \frac{1}{2}\left(e^z + e^{-z}\right), \qquad \tanh z = \frac{\sinh z}{\cosh z},$$
$$\operatorname{sech} z = \frac{1}{\cosh z}, \qquad \operatorname{csch} z = \frac{1}{\sinh z}. \tag{3-13}$$
¹ Note, however, that $z^c$ is single-valued when c is an integer.
These functions are clearly single-valued. The inverse trigonometric and
inverse hyperbolic functions are defined in the same way as in real-variable
analysis, and they are multiple-valued.¹
Example 1. Compute $e^{1-i}$.
On setting $x = 1$ and $y = -1$ in the formula (3-1) we get
$$e^{1-i} = e[\cos(-1) + i \sin(-1)] = e(\cos 1 - i \sin 1).$$
Since $\cos 1 = 0.54030$, $\sin 1 = 0.84147$, and $e = 2.718$,
$$e^{1-i} = 2.718(0.5403 - i\,0.8415) = 1.469 - i\,2.287$$
to three decimal places.
Example 2. Compute $\sin(1 - i)$.
Since
$$\sin z = \frac{1}{2i}\left(e^{iz} - e^{-iz}\right)$$
and $z = 1 - i$, we have
$$\sin(1 - i) = \frac{1}{2i}\left(e^{i+1} - e^{-i-1}\right) = \frac{1}{2i}\left\{e(\cos 1 + i \sin 1) - e^{-1}[\cos(-1) + i \sin(-1)]\right\}$$
$$= \frac{e - e^{-1}}{2i}\cos 1 + \frac{e + e^{-1}}{2}\sin 1.$$
We can obtain the same result by making use of the addition formulas of trigonometry.
Thus,
$$\sin(1 - i) = \sin 1 \cos(-i) + \cos 1 \sin(-i) = \sin 1 \cos i - \cos 1 \sin i.$$
But by (3-5)
$$\cos i = \frac{1}{2}\left(e^{-1} + e^{1}\right), \qquad \sin i = \frac{1}{2i}\left(e^{-1} - e^{1}\right).$$
Substitution in the foregoing formula yields the result obtained from the definition of
sin z.
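The arithmetic of Examples 1 and 2 can be confirmed with a few lines of code. This is a minimal sketch assuming Python's standard `cmath` and `math` modules; the variable names are illustrative. The closed form used for sin(1 - i) is the result of Example 2 rewritten with hyperbolic functions, since $(e + e^{-1})/2 = \cosh 1$ and $(e - e^{-1})/2 = \sinh 1$.

```python
import cmath
import math

# Example 1: e^{1-i} = e (cos 1 - i sin 1), from formula (3-1).
e_val = math.e * complex(math.cos(1), -math.sin(1))

# Example 2: sin(1-i) = sin 1 cosh 1 - i cos 1 sinh 1.
sin_val = complex(math.sin(1) * math.cosh(1), -math.cos(1) * math.sinh(1))

# Compare with the library's complex exponential and sine.
print(e_val, cmath.exp(1 - 1j))
print(sin_val, cmath.sin(1 - 1j))
```

Rounding `e_val` to three decimals reproduces the value 1.469 - 2.287i quoted in Example 1.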
Example 3. Compute $\log(1 + i)$.
Since $1 + i = \sqrt{2}\,[\cos(\pi/4) + i \sin(\pi/4)]$,
$$\log(1 + i) = \operatorname{Log}\sqrt{2} + \left(\frac{\pi}{4} + 2k\pi\right)i, \qquad k = 0, \pm 1, \pm 2, \ldots$$
by (3-10). The principal value is obtained by setting $k = 0$.
Example 4. Compute $2^i$.
By (3-12),
$$2^i = e^{i \log 2}.$$
¹ See Probs. 7 and 8.
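But $\log 2 = \operatorname{Log} 2 + i\,2\pi k$. Hence
$$2^i = e^{i \operatorname{Log} 2 - 2\pi k}, \qquad k = 0, \pm 1, \pm 2, \ldots.$$
Example 5. Compute $i^i$.
By (3-12),
$$i^i = e^{i \log i}.$$
But $\log i = \operatorname{Log} 1 + i[(\pi/2) + 2k\pi] = i[(\pi/2) + 2k\pi]$, and hence
$$i^i = e^{-(\pi/2 + 2k\pi)}, \qquad k = 0, \pm 1, \pm 2, \ldots.$$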
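The many-valuedness in Examples 4 and 5 is easy to see numerically. The following sketch assumes Python's standard `cmath` and `math` modules; `power_values` is a hypothetical helper that applies (3-12) branch by branch. Python's built-in `**` operator returns only the branch with $k = 0$.

```python
import cmath
import math

def power_values(z, c, ks):
    """Values of z**c = exp(c log z) for the branches k in ks, following (3-12)."""
    log_r = math.log(abs(z))
    theta = cmath.phase(z)
    return [cmath.exp(c * complex(log_r, theta + 2 * math.pi * k)) for k in ks]

ks = (-1, 0, 1)
two_i = power_values(2, 1j, ks)   # Example 4: 2^i
i_i = power_values(1j, 1j, ks)    # Example 5: i^i

for k, a, b in zip(ks, two_i, i_i):
    print(k, a, b)
```

Every value of $i^i$ comes out real, and the $k = 0$ value is $e^{-\pi/2} \approx 0.2079$.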
Example 6. Find all solutions of the equation $\cos z - 2 = 0$.
We have $\cos z = 2$, which gives, successively,
$$\frac{e^{iz} + e^{-iz}}{2} = 2,$$
$$e^{iz} + e^{-iz} = 4,$$
$$e^{2iz} - 4e^{iz} + 1 = 0.$$
Solving for $e^{iz}$,
$$e^{iz} = \frac{4 \pm \sqrt{16 - 4}}{2} = 2 \pm \sqrt{3}.$$
Hence
$$iz = \log(2 \pm \sqrt{3})$$
and
$$z = \frac{1}{i}\log(2 \pm \sqrt{3}).$$
Since $\log(2 \pm \sqrt{3})$ is infinitely-many-valued, there are infinitely many values of z.
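A few of the roots found in Example 6 can be generated and substituted back into the equation. This is a minimal sketch assuming Python's standard `cmath` and `math` modules; the function name `solutions` and the choice of branches $k = -1, 0, 1$ are illustrative.

```python
import cmath
import math

def solutions(ks=(-1, 0, 1)):
    """Some roots of cos z = 2, from z = (1/i) log(2 +/- sqrt(3))."""
    roots = []
    for s in (2 + math.sqrt(3), 2 - math.sqrt(3)):
        for k in ks:
            # s is real and positive, so log s = Log s + 2*pi*k*i by (3-10).
            iz = complex(math.log(s), 2 * math.pi * k)
            roots.append(iz / 1j)
    return roots

for z in solutions():
    print(z, cmath.cos(z))
```

Each root has the form $2\pi k \mp i\,\operatorname{Log}(2 + \sqrt{3})$, and the printed cosines should all equal 2 up to rounding error.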
PROBLEMS
1. Verify the following: (a) $e^{z_1} e^{z_2} = e^{z_1+z_2}$; (b) $\sin^2 z + \cos^2 z = 1$; (c) $\cos(z_1 + z_2) = \cos z_1 \cos z_2 - \sin z_1 \sin z_2$; (d) $\cos iz = \cosh z$; (e) $\sin iz = i \sinh z$.
2. If a and b are real integers, show that $(re^{i\theta})^{a+bi} = r^a e^{-b\theta}[\cos(a\theta + b \operatorname{Log} r) + i \sin(a\theta + b \operatorname{Log} r)]$.
3. Compute (a) oos (2 4* t), (b) 1*, (c) (1 4- 0\ (d) 2 1+ \ ( e )
4. Express in the form a 4- bi t where a and 6 are real: (a) l/(a — 1), (6) 1 /(«* 4. t),
(c) sin (1 4- 1), (d) e 1 *, (e) e l/ *.
6. Find the principal values and represent the numbers graphically: (a) log (—4),
(b) log (5i), (c) log (1 4- *), (d) log t, (e) i\ (/) 6 1+ \ (g) sin 2i.
6. Find all solutions of the following equations: (a) $e^z + 1 = 0$; (b) $\sin z - 2 = 0$;
(c) $\cos^{-1} z = 2$; (d) $\cos z - 1 = 0$.
7. The inverse functions are defined as solutions of the equation $z = f(w)$ for w in terms
of z. Thus, $w = \sin^{-1} z$ if $z = \sin w = (e^{iw} - e^{-iw})/2i$. Obtain $e^{iw}$ in this example
by solving the equation $e^{2iw} - 2ize^{iw} - 1 = 0$. The result is $e^{iw} = iz \pm \sqrt{1 - z^2}$.
Hence $w = \sin^{-1} z = -i \log(iz \pm \sqrt{1 - z^2})$. Show in the same way that
$$\tan^{-1} z = \frac{i}{2}\log\frac{i + z}{i - z} \qquad\text{and}\qquad \cos^{-1} z = -i \log(z \pm \sqrt{z^2 - 1}).$$
8. Refer to Prob. 7 and show that:
(a) $\sinh^{-1} z = \log(z + \sqrt{z^2 + 1})$, (b) $\cosh^{-1} z = \log(z + \sqrt{z^2 - 1})$,
(c) $\tanh^{-1} z = \dfrac{1}{2}\log\dfrac{1 + z}{1 - z}$.
9. For complex numbers a, b, c in what sense and in what circumstances is it true
that $(ab)^c = a^c b^c$?
4. Analytic Functions of a Complex Variable. We say that a point
$z = x + iy$ approaches a fixed point $z_0 = x_0 + iy_0$ if $x \to x_0$ and $y \to y_0$.
Let f(z) be a single-valued function defined in some neighborhood of the
point $z = z_0$. By the neighborhood of $z_0$ we mean the set of all points in a
sufficiently small circular region with center at $z_0$. As $z \to z_0$, the function
f(z) may tend to a definite value $w_0$. We say, then, that the limit of f(z)
as z approaches $z_0$ is $w_0$ and write
$$\lim_{z \to z_0} f(z) = w_0.$$
In particular, if $f(z_0) = w_0$, we say that f(z) is continuous at $z = z_0$.
It is not difficult to prove that if $f(z) = u(x,y) + iv(x,y)$ is continuous
at $z_0 = x_0 + iy_0$, then its real and imaginary parts u and v are continuous
functions at $(x_0, y_0)$, and conversely.
Let w = f(z) be continuous at every point of some region in the z plane.
The complex quantities w and z can be represented on separate complex
planes, called the w and z planes. The relationship w = f(z) sets up a
correspondence between the points (x,y) in the z plane and the points
(u,v) in the w plane (see Figs. 6 and 7), so that the corresponding points
(u,v) fill some region R' in the w plane.
If $z_0 = x_0 + iy_0$ and $z = z_0 + \Delta z$ are two points in the z plane with
$\Delta z = \Delta x + i\,\Delta y$, the corresponding points in the w plane are $w_0 = u_0 + iv_0$
and $w = w_0 + \Delta w$, where $\Delta w = \Delta u + i\,\Delta v$. The change $\Delta w$ in the
value of $w_0 = f(z_0)$ corresponding to the increment $\Delta z$ in $z_0$ is
$$\Delta w = f(z_0 + \Delta z) - f(z_0)$$
and we define the derivative dw/dz [or f'(z)] by a familiar formula
$$f'(z_0) = \lim_{\Delta z \to 0}\frac{\Delta w}{\Delta z} = \lim_{\Delta z \to 0}\frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z}. \tag{4-1}$$
It is most important to note that in this formula $z = z_0 + \Delta z$ can assume
any position in the neighborhood of $z_0$ and $\Delta z$ can approach zero along
any one of the infinitely many paths joining z with $z_0$. Hence, if the
derivative $f'(z_0)$ is to have a unique value, we must demand that the limit
in (4-1) be independent of the way in which $\Delta z$ is made to approach zero.
This restriction greatly narrows down the class of complex functions that
possess derivatives.
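For example, if
$$w = z\bar z,$$
then on replacing z by $z + \Delta z$ and $\bar z$ by $\bar z + \overline{\Delta z}$, we get
$$w + \Delta w = (z + \Delta z)(\bar z + \overline{\Delta z}) = z\bar z + \bar z\,\Delta z + z\,\overline{\Delta z} + \Delta z\,\overline{\Delta z}.$$
Hence
$$\Delta w = \bar z\,\Delta z + z\,\overline{\Delta z} + \Delta z\,\overline{\Delta z}$$
and
$$\frac{\Delta w}{\Delta z} = \bar z + z\,\frac{\overline{\Delta z}}{\Delta z} + \overline{\Delta z}. \tag{4-2}$$
We show next that this quotient, in general, has no unique limit as $\Delta z$
is made to approach zero along different paths. Since $z = x + iy$,
$$\Delta z = \Delta x + i\,\Delta y, \qquad \overline{\Delta z} = \Delta x - i\,\Delta y$$
and we can write (4-2) as
$$\frac{\Delta w}{\Delta z} = x - iy + (x + iy)\,\frac{\Delta x - i\,\Delta y}{\Delta x + i\,\Delta y} + \Delta x - i\,\Delta y. \tag{4-3}$$
If we now let $\Delta z$ in (4-3) approach zero along the path QRP (Fig. 8), so
that first $QR = \Delta y \to 0$ and then $PR = \Delta x \to 0$, we get
$$\lim_{\Delta z \to 0}\frac{\Delta w}{\Delta z} = 2x.$$
But if we take the path QR'P and first allow $QR' = \Delta x \to 0$ and then
$R'P = \Delta y \to 0$, we obtain
$$\lim_{\Delta z \to 0}\frac{\Delta w}{\Delta z} = -2iy.$$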
Except for $x = y = 0$, these limits are distinct, and hence $w = z\bar z$ has no
derivative except possibly at $z = 0$. As a matter of fact, it is possible to
show that this function does have a derivative (whose value is zero) only
at the point $z = 0$.
On the other hand, if we consider
$$w = z^2,$$
then $w + \Delta w = (z + \Delta z)^2 = z^2 + 2z\,\Delta z + (\Delta z)^2$, so that
$$\frac{\Delta w}{\Delta z} = \frac{2z\,\Delta z + (\Delta z)^2}{\Delta z} = 2z + \Delta z.$$
The limit of this quotient as $\Delta z \to 0$ is invariably 2z, whatever may be
the path along which $\Delta z \to 0$. In this example the derivative exists and
its value is 2z.
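The contrast between the two examples shows up clearly if the difference quotient is evaluated for a small increment taken first along the real direction and then along the imaginary direction. The sketch below uses only built-in Python complex arithmetic; the names `quotient`, `f_bad`, and `f_good`, the point `z0`, and the step `h` are illustrative assumptions.

```python
def quotient(f, z, dz):
    """Difference quotient (f(z + dz) - f(z)) / dz from (4-1), before the limit."""
    return (f(z + dz) - f(z)) / dz

z0 = 1 + 2j      # sample point with x = 1, y = 2
h = 1e-6         # small step size

f_bad = lambda z: z * z.conjugate()   # w = z * conj(z)
f_good = lambda z: z * z              # w = z^2

along_x = quotient(f_bad, z0, h)         # increment parallel to the x axis
along_y = quotient(f_bad, z0, 1j * h)    # increment parallel to the y axis

print(along_x, along_y)                  # close to 2x = 2 and -2iy = -4i
print(quotient(f_good, z0, h), quotient(f_good, z0, 1j * h))  # both close to 2*z0
```

For the first function the two directions give different values, so no unique derivative exists at `z0`; for the second function both quotients agree with 2z.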
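We obtain next a set of conditions which real and imaginary parts of
$$w = f(z) = u(x,y) + iv(x,y)$$
must fulfill if f(z) is to have a unique derivative at a given point $z = x + iy$.
Since $\Delta w = \Delta u + i\,\Delta v$ and $\Delta z = \Delta x + i\,\Delta y$, we get from (4-1)
$$f'(z) = \lim_{\Delta z \to 0}\frac{\Delta u + i\,\Delta v}{\Delta z} = \lim_{\substack{\Delta x \to 0 \\ \Delta y \to 0}}\frac{\Delta u + i\,\Delta v}{\Delta x + i\,\Delta y}. \tag{4-4}$$
Now, if we let $\Delta z \to 0$ by first allowing $\Delta y \to 0$ and then $\Delta x \to 0$,
we get from (4-4)
$$f'(z) = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x}. \tag{4-5}$$
If, on the other hand, we compute the limit in (4-4) by making first $\Delta x \to 0$
and then $\Delta y \to 0$, we obtain
$$f'(z) = \frac{\partial v}{\partial y} - i\,\frac{\partial u}{\partial y}. \tag{4-6}$$
Hence, if the derivatives in (4-5) and (4-6) are to have identical values at
a given point z for these two particular modes of approach of $\Delta z$ to zero,
we must have
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. \tag{4-7}$$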
Equations (4-7) are known as the Cauchy-Riemann equations, and the
foregoing calculation shows that they constitute necessary conditions for
the existence of a unique derivative of $f(z) = u(x,y) + iv(x,y)$ at $z = x + iy$.
These equations also turn out to be sufficient¹ if one further assumes the
continuity of partial derivatives in (4-7) at the point (x,y).
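A rough numerical test of the Cauchy-Riemann equations (4-7) can be made with finite differences. The following is a sketch under stated assumptions: it uses Python's standard `cmath` module, central differences with a fixed step `h`, and hypothetical helper names `partials` and `cr_residual`. A small residual is only evidence that (4-7) holds at the sampled point, not a proof of analyticity.

```python
import cmath

def partials(f, x, y, h=1e-6):
    """Central-difference estimates of u_x, u_y, v_x, v_y for f = u + iv."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def cr_residual(f, x, y):
    """|u_x - v_y| + |v_x + u_y|, which vanishes when (4-7) is satisfied."""
    ux, uy, vx, vy = partials(f, x, y)
    return abs(ux - vy) + abs(vx + uy)

print(cr_residual(cmath.exp, 0.3, -0.8))                      # nearly zero
print(cr_residual(lambda z: z * z.conjugate(), 1.0, 2.0))     # far from zero
print(cr_residual(lambda z: z * z.conjugate(), 0.0, 0.0))     # zero at the origin
```

The second and third lines reflect the earlier observation that $w = z\bar z$ satisfies the conditions only at $z = 0$.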
Complex functions which have derivatives only at isolated points in the
z plane are of minor interest in applications in comparison with those that
have derivatives throughout the neighborhood of the given point. We say
that a function f(z) is analytic (or holomorphic) at a given point $z = z_0$ if it
has a derivative f'(z) at $z = z_0$ and at every point in the neighborhood of $z_0$.
It can be shown that the following theorem² is true.
Theorem. A necessary and sufficient condition for $f(z) = u(x,y) + iv(x,y)$
to be analytic at $z_0 = x_0 + iy_0$ is that u(x,y) and v(x,y) together with their
partial derivatives be continuous and satisfy Eqs. (4-7) in the neighborhood
of $(x_0, y_0)$.
The points of the region where f(z) ceases to be analytic are called
singular points of f(z).
It is easy to show that familiar rules for differentiating sums, products,
and quotients of real functions remain valid for analytic functions.³ Also
the formulas for differentiating elementary complex functions, defined in
Sec. 3, are identical with the corresponding formulas in the calculus of
real variables. We give a derivation of several such formulas in the following
examples.⁴
¹ A demonstration of this is given in several standard texts. See, for example, E. C.
Titchmarsh, "The Theory of Functions," 2d ed., p. 68, Oxford University Press, London,
1939.
² This theorem can be deduced with the aid of the strong form of Cauchy's theorem
stated in Sec. 5.
³ See Prob. 1.
⁴ See also Prob. 2.
Example 1. Show that $de^z/dz = e^z$.
If $w = e^z = e^{x+iy}$, then the definition (3-1) yields
$$w = u + iv = e^x(\cos y + i \sin y).$$
Here, $u = e^x \cos y$, $v = e^x \sin y$, and it follows that
$$\frac{\partial u}{\partial x} = e^x \cos y, \qquad \frac{\partial u}{\partial y} = -e^x \sin y, \qquad \frac{\partial v}{\partial x} = e^x \sin y, \qquad \frac{\partial v}{\partial y} = e^x \cos y.$$
Since Eqs. (4-7) are satisfied and the partial derivatives are continuous, dw/dz can be
calculated with the aid of either (4-5) or (4-6). Then,
$$\frac{dw}{dz} = e^x \cos y + ie^x \sin y = e^x(\cos y + i \sin y) = e^z.$$
Example 2. Show that $(d \log z)/dz = 1/z$ if $z \ne 0$.
The function $w = \log z$, as noted in Sec. 3, is multiple-valued. However, any branch
of this function obtained by fixing the value of k in (3-10) is single-valued, and the application
of the Cauchy-Riemann equations (4-7) to it shows that it is an analytic function except at
$z = 0$. On fixing k we get from $w = \log z$ a single-valued function
$$z = e^w$$
whose derivative with respect to w by Example 1 is
$$\frac{dz}{dw} = e^w = z.$$
Hence
$$\frac{dw}{dz} = \frac{d \log z}{dz} = \frac{1}{z}$$
if $z \ne 0$.
The point $z = 0$ is a singular point of $w = \log z$, since the derivative at that point
ceases to exist.
Example 3. Show that $dz^n/dz = nz^{n-1}$ for all values of n (real or complex).
If $w = z^n$, then
$$\log w = n \log z.$$
On differentiating this with respect to z, we get
$$\frac{1}{w}\frac{dw}{dz} = \frac{n}{z}.$$
Hence
$$\frac{dw}{dz} = n\,\frac{w}{z} = nz^{n-1},$$
since $w = z^n$. This derivative ceases to exist at $z = 0$ if $n < 1$.
PROBLEMS
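1. Show that
(a) $\dfrac{d}{dz}(f_1 \pm f_2) = f_1'(z) \pm f_2'(z)$, (b) $\dfrac{d}{dz}(f_1 f_2) = f_1 f_2' + f_1' f_2$,
(c) $\dfrac{d}{dz}\left(\dfrac{f_1}{f_2}\right) = \dfrac{f_2 f_1' - f_1 f_2'}{f_2^2}$, (d) $\dfrac{df_1}{dz} = \dfrac{df_1}{df_2}\,\dfrac{df_2}{dz}$,
whenever $f_1$ and $f_2$ are analytic functions.
2. Show that
(a) $\dfrac{d(\cos z)}{dz} = -\sin z$, (b) $\dfrac{d(\sin z)}{dz} = \cos z$, (c) $\dfrac{d(\tan z)}{dz} = \sec^2 z$,
(d) $\dfrac{d(\tan^{-1} z)}{dz} = \dfrac{1}{1 + z^2}$, (e) $\dfrac{d(\sinh z)}{dz} = \cosh z$, (f) $\dfrac{da^z}{dz} = a^z \log a$.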
3. Determine where each of the following functions fails to be analytic: (a) z 2 -f- 2 z
(b) z/(z -f 1), M 1 /* + (z ~ l) 2 , (rf) tan z , (e) l/[(z - l)(z -f 1)],(/) 2 | > (^) £ ) *2 - J
- (i) x/(x 2 -f y 2 ) -f %/(* 2 + I/ 5 ), 0) |*|, W tan" 1 z ,
5. Integration of Complex Functions. Cauchy's Integral Theorem. We
define the integral $\int_C f(z)\,dz$ of a complex function $f(z) = u(x,y) + iv(x,y)$
along a path C in terms of real line integrals as follows:
$$\int_C f(z)\,dz \equiv \int_C (u + iv)(dx + i\,dy) = \int_C (u\,dx - v\,dy) + i\int_C (v\,dx + u\,dy). \tag{5-1}$$
Real integrals of this type were studied in Chap. 5, Sec. 4, where it was
observed that they exist when the functions u(x,y) and v(x,y) are continuous
and the path C is sufficiently smooth.
The integral in (5-1) can also be defined in a manner of Sec. 4, Chap. 5,
by the formula
$$\int_C f(z)\,dz = \lim_{\substack{n \to \infty \\ \max|z_i - z_{i-1}| \to 0}} \sum_{i=1}^{n} f(\zeta_i)(z_i - z_{i-1}). \tag{5-2}$$
It is supposed that the curve C is divided into n segments by points $z_i$
and that $\zeta_i$ is some point of the ith segment. The limit is then computed
as the number of segments is allowed to increase indefinitely in such a
way that the length of the largest segment tends to zero. The fact that
the definitions (5-1) and (5-2) are equivalent follows from consideration
of Sec. 4, Chap. 5.
As an illustration of the use of formula (5-1) consider the integral
$$\int_C \bar z^2\,dz, \tag{5-3}$$
where the path C is a straight line joining the points $z = 0$ and $z = 1 + 2i$
(Fig. 9). Since $\bar z^2 = (x - iy)^2 = x^2 - y^2 - i2xy$, we get, on substituting
$u = x^2 - y^2$, $v = -2xy$ in (5-1),
$$\int_C \bar z^2\,dz = \int_C [(x^2 - y^2)\,dx + 2xy\,dy] + i\int_C [-2xy\,dx + (x^2 - y^2)\,dy]. \tag{5-4}$$
But the cartesian equation of C is $y = 2x$, and hence (5-4) can be reduced
to the evaluation of two definite integrals:
$$\int_C \bar z^2\,dz = \int_0^1 5x^2\,dx + i\int_0^1 (-10x^2)\,dx = \frac{5}{3} - i\,\frac{10}{3}.$$
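The sum in (5-2) lends itself directly to computation. The sketch below is a minimal midpoint-rule version for paths made of straight segments, written in plain Python; the function name `path_integral`, the number of subdivisions, and the second "two-leg" path through z = 1 are illustrative assumptions rather than anything prescribed by the text.

```python
def path_integral(f, points, n=2000):
    """Approximate the integral of f along the polygonal path through `points`.

    Each straight segment is cut into n pieces and f is sampled at the
    midpoint of each piece, which is one admissible choice in the sum (5-2).
    """
    total = 0j
    for a, b in zip(points, points[1:]):
        step = (b - a) / n
        for i in range(n):
            total += f(a + (i + 0.5) * step) * step
    return total

f = lambda z: z.conjugate() ** 2

direct = path_integral(f, [0j, 1 + 2j])             # straight line y = 2x
two_leg = path_integral(f, [0j, 1 + 0j, 1 + 2j])    # along the x axis, then vertically

print(direct)    # close to 5/3 - (10/3) i
print(two_leg)   # close to 13/3 - (2/3) i
```

The two results differ, which illustrates numerically that this particular integral depends on the path joining the end points.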
The value of the integral (5-4) depends on the path C joining the given
points $z = 0$, $z = 1 + 2i$, for according to Sec. 9, Chap. 5, a necessary and
sufficient condition that the line integral
$$\int_C M\,dx + N\,dy \tag{5-5}$$
be independent of the path in a simply connected region R is that
$$\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \tag{5-6}$$
throughout R. We further recall that in deducing the condition (5-6)
with the aid of Green's theorem it was supposed that M(x,y), N(x,y),
and their partial derivatives in (5-6) are continuous functions throughout
the region. It is readily checked that Eq. (5-6) is not satisfied by the
functions appearing in the line integrals in (5-4).
If, however, $f(z) = u + iv$ in (5-1) is an analytic function, then the
Cauchy-Riemann equations (4-7) demand that
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. \tag{5-7}$$
Reference to (5-6) shows that these conditions are precisely those that
ensure the independence of the path of the line integrals in (5-1), provided
that the partial derivatives in (5-7) are continuous functions in the given
simply connected region R. Thus, if we suppose that f(z) is analytic in the
given simply connected region and f'(z) is continuous there, then the
integral
$$\int_C f(z)\,dz$$
is independent of the path joining any pair of points in the region. If
the path C is closed, then the value of this integral is zero. We thus
have a theorem, first deduced by Cauchy, which is of cardinal importance
in the study of analytic functions. Although the foregoing proof assumes
the continuity of f'(z), the theorem can actually be established¹ under
the sole hypothesis that f'(z) exists at each point of the region, and we
state it in this strong form.
Cauchy's Integral Theorem. If f(z) is analytic at all points within
and on a closed curve C, then $\oint_C f(z)\,dz = 0$.
We conclude this section by deducing, from definition (5-2), a useful
inequality furnishing an upper bound for the value of the complex integral
$\int_C f(z)\,dz$. Inasmuch as the modulus of the sum of complex numbers is
never greater than the sum of the moduli,
$$\left|\int_C f(z)\,dz\right| \le \int_C |f(z)| \cdot |dz|.$$
Now, if the modulus |f(z)| of f(z) along C does not exceed in value some
positive number M, then
$$\left|\int_C f(z)\,dz\right| \le M\int_C |dz| = M\int_C |dx + i\,dy| = M\int_C ds = ML, \tag{5-8}$$
where L is the length of C.
As an illustration of the use of the inequality (5-8) we apply it to deduce
an upper bound for the integral (5-3). The modulus of $\bar z^2$ takes its maximum
at the point $z = 1 + 2i$. Hence we can take M in (5-3) as $|1 + 2i|^2 = 5$,
and (5-8) then yields
$$\left|\int_C \bar z^2\,dz\right| \le 5\sqrt{5},$$
inasmuch as $L = \sqrt{5}$ for the rectilinear path in (5-3).
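The bound (5-8) can be compared with the actual modulus of the integral. This is a small sketch using Python's standard `math` module; `ml_bound` is a hypothetical helper that samples $|f|$ on a straight segment to estimate M, and the exact value $5/3 - (10/3)i$ used for comparison is the one that follows from (5-4) with $y = 2x$.

```python
import math

def ml_bound(f, a, b, n=1000):
    """Estimate M * L for the straight segment from a to b.

    M is the largest sampled value of |f| on the segment and L = |b - a|.
    """
    M = max(abs(f(a + (b - a) * i / n)) for i in range(n + 1))
    return M * abs(b - a)

f = lambda z: z.conjugate() ** 2

bound = ml_bound(f, 0j, 1 + 2j)          # 5 * sqrt(5), about 11.18
exact = abs(complex(5 / 3, -10 / 3))     # modulus of the value from (5-4), about 3.73

print(exact, bound)
```

The estimate is far from sharp here, which is typical: (5-8) is valued for its simplicity rather than its precision.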
¹ See Titchmarsh, op. cit., pp. 75-83. In a somewhat different development of the
subject one deduces the continuity of f'(z) from Cauchy's theorem, not the other way
about. We shall see in Sec. 7 that the theorem actually implies existence and continuity
of derivatives of all orders.
PROBLEMS
1. Find the value of the integral $\int_C z^2\,dz$ along the rectilinear path joining the points
$z = 0$ and $z = 2 + i$. Show that this integral is independent of the path.
1 Find the value of the integral l l dz along the rectilinear path y » x joining the
Jc
points (0,0) and (1,1) and also along the parabola y « x 2 joining the same points.
3. Show that the integral $\int_C \bar z\,dz$ evaluated over the path $|z| = 1$ in a counterclockwise
direction yields $2\pi i$. Note that $z = e^{i\theta}$ and $\bar z = e^{-i\theta}$ along the path $|z| = 1$.
r l 2 — i
4. Find the value of the integral / dz , where the path is the upper half of the
J- i ^
circle |*| *■» 1. Calculate the value of this integral over the lower half of the circle
5. Show that $\int_C (1 + z^2)\,dz$ is independent of the path C, and evaluate this integral
when C is the boundary of the square with vertices at the points $z = 0$, $z = 1$, $z = 1 + i$,
and $z = i$.
6. What is the value of the integral I e % * dz where C is the boundary of the square in
Prob. 5? °
7. Find the value of the integral $\int_0^{\pi i} e^z\,dz$ over any path joining $z = 0$ and $z = \pi i$.
8. Use formula (5-8) to show that:
1 j { r2+t rj, | | rt i
(a) J j z l dz J < 10, (6) j J “j j < 2, (c) J J (x 2 + *t/ 2 ) dz J < 2,
where paths are straight lines joining the points appearing in the limits of these integrals.
6. Cauchy's Integral Theorem for Multiply Connected Regions. In
establishing Cauchy's integral theorem in the preceding section we assumed
that the region bounded by the curve C is simply connected. It is
easy to extend this theorem to multiply connected domains in the
manner of Sec. 9, Chap. 5. Thus consider, for definiteness, a doubly
connected region (Fig. 10) bounded by closed curves $C_1$ and $C_2$, where
$C_2$ lies entirely within $C_1$. We assume that f(z) is analytic in the
region exterior to $C_2$ and interior to $C_1$ and analytic on $C_2$ and $C_1$. The
requirement of analyticity on $C_1$ and $C_2$ implies that the function f(z) is
analytic in an extended region (indicated by the dashed curves $K_1$ and $K_2$)
that contains the curves $C_1$ and $C_2$.
Fig. 10
If some point A of the curve $C_1$ is joined to a point B of $C_2$ by a crosscut
AB, then the region becomes simply connected and the theorem of Cauchy
is applicable. Integrating in the positive direction gives
$$\int_{APA} f(z)\,dz + \int_{AB} f(z)\,dz + \int_{BQB} f(z)\,dz + \int_{BA} f(z)\,dz = 0, \tag{6-1}$$
where the subscripts on the integrals indicate the directions of integration
along $C_1$, the crosscut AB, and $C_2$. Since the second and the fourth
integrals in (6-1) are calculated over the same path in opposite directions,
their sum is zero and one has
$$\oint_{C_1} f(z)\,dz + \oint_{C_2} f(z)\,dz = 0, \tag{6-2}$$
where the integral along $C_1$ is traversed in the counterclockwise direction
and that along $C_2$ in the clockwise direction. Changing the order of
integration in the second integral in (6-2) gives
$$\oint_{C_1} f(z)\,dz = \oint_{C_2} f(z)\,dz, \tag{6-3}$$
where both integrals are now taken in the counterclockwise direction.
We see that the values of the integral of f(z) over two different paths
$C_1$ and $C_2$ are equal, but they need not be zero inasmuch as f(z) may not
be analytic at every point of the region bounded by $C_2$. But whatever
may be the value of the integral over the path $C_2$, it is the same as its
value over the path $C_1$. An important principle of the deformation of
contours follows at once from this observation: The integral of an analytic
function over any closed curve $C_1$ has the same value over any other curve $C_2$
into which $C_1$ can be continuously deformed without passing over singular
points of f(z).
We shall see that this principle will enable us to simplify the computation
of integrals of analytic functions.
The foregoing results can be extended in an obvious way to yield the
following theorem:
Theorem. If f(z) is analytic in a closed multiply connected region bounded
by the exterior curve C and the interior curves $C_1, C_2, \ldots, C_n$, then the integral
over the exterior curve C is equal to the sum of the integrals over the
interior curves provided that the integration over all the contours is performed
in the same direction.
It should be noted that the requirement of analyticity of f(z) in the
closed region implies that f(z) be analytic on all contours forming its
boundary.
Before considering applications of the theorem of this section to specific
problems, we deduce an important result which will enable us to compute
many integrals by a method which is vastly simpler than that developed
in Sec. 5.
7. The Fundamental Theorem of Integral Calculus. Let f(z) be analytic
in a simply connected region R (Fig. 11), and let C be a curve joining two
points $P_0$ and P of the region determined by the complex numbers $z_0$ and z.
We consider the integral
$$\int_{z_0}^{z} f(z)\,dz \tag{7-1}$$
along C. Since f(z) is analytic, the integral (7-1) is independent of the path, and its
value is completely determined by the choice of $z_0$ and z. If $z_0$ is fixed, the integral (7-1)
defines a function
$$F(z) = \int_{z_0}^{z} f(z)\,dz \tag{7-2}$$
for every choice of z in R.
To emphasize the fact that the integration variable z plays a distinct
role from the variable z appearing in the upper limit of the integral, we
can rewrite (7-2) as
$$F(z) = \int_{z_0}^{z} f(\zeta)\,d\zeta. \tag{7-3}$$
We prove next that F(z) is an analytic function and, moreover, its
derivative at any point z has the value of the function in the integrand
at that point. That is,
$$F'(z) = f(z).$$
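We can use (7-3) to compute the difference quotient
$$\frac{F(z + \Delta z) - F(z)}{\Delta z} = \frac{1}{\Delta z}\left[\int_{z_0}^{z+\Delta z} f(\zeta)\,d\zeta - \int_{z_0}^{z} f(\zeta)\,d\zeta\right] = \frac{1}{\Delta z}\int_{z}^{z+\Delta z} f(\zeta)\,d\zeta, \tag{7-4}$$
and rewrite (7-4) by adding and subtracting f(z) in the integrand:
$$\frac{F(z + \Delta z) - F(z)}{\Delta z} = \frac{1}{\Delta z}\int_{z}^{z+\Delta z} [f(\zeta) - f(z) + f(z)]\,d\zeta = \frac{f(z)}{\Delta z}\int_{z}^{z+\Delta z} d\zeta + \frac{1}{\Delta z}\int_{z}^{z+\Delta z} [f(\zeta) - f(z)]\,d\zeta.$$
But $\int_{z}^{z+\Delta z} d\zeta = \Delta z$, so that
$$\frac{F(z + \Delta z) - F(z)}{\Delta z} = f(z) + \frac{1}{\Delta z}\int_{z}^{z+\Delta z} [f(\zeta) - f(z)]\,d\zeta. \tag{7-5}$$
Now if
$$\lim_{\Delta z \to 0}\frac{1}{\Delta z}\int_{z}^{z+\Delta z} [f(\zeta) - f(z)]\,d\zeta = 0, \tag{7-6}$$
then it would follow from (7-5) that $F'(z) = f(z)$. The fact that the limit
in (7-6) is, indeed, zero follows at once from the estimate (5-8), for if
$M = \max|f(\zeta) - f(z)|$ on the path joining z and $z + \Delta z$, then
$$\left|\frac{1}{\Delta z}\int_{z}^{z+\Delta z} [f(\zeta) - f(z)]\,d\zeta\right| \le M.$$
But since f(z) is continuous, $M \to 0$ as $\Delta z \to 0$.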
Any function $F_1(z)$ such that $F_1'(z) = f(z)$ is called a primitive or an
indefinite integral of f(z). As in real calculus, it is easy to prove that if
$F_1(z)$ and $F_2(z)$ are any two indefinite integrals of f(z), then they can
differ only by a constant.¹
Hence, if $F_1(z)$ is an indefinite integral of f(z), it follows that
$$F(z) = \int_{z_0}^{z} f(z)\,dz = F_1(z) + C.$$
To evaluate C, set $z = z_0$; then, since $\int_{z_0}^{z_0} f(z)\,dz = 0$, $C = -F_1(z_0)$. Thus
$$F(z) = \int_{z_0}^{z} f(z)\,dz = F_1(z) - F_1(z_0). \tag{7-7}$$
The statement embodied in (7-7) establishes the connection between line
and indefinite integrals and is called the fundamental theorem of integral
calculus because of its importance in the evaluation of line integrals. It
states that the value of the line integral of an analytic function is equal to
the difference in the values of any primitive at the end points of the path of
integration.
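¹ Proof: Since $F_1'(z) = F_2'(z) = f(z)$, it is evident that
$$F_1'(z) - F_2'(z) = \frac{d(F_1 - F_2)}{dz} \equiv \frac{dG}{dz} = 0.$$
But if $dG/dz = 0$, it means that $G'(z) = (\partial u/\partial x) + i(\partial v/\partial x) = (\partial v/\partial y) - i(\partial u/\partial y) = 0$,
so that $\partial u/\partial x = \partial v/\partial x = \partial u/\partial y = \partial v/\partial y = 0$, and thus u and v do not depend on x
and y.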
Example 1. As an illustration of the use of formula (7-7) consider the evaluation of
$$\int_C z^2\,dz \tag{7-8}$$
along some path C joining $z = 0$ and $z = 2 + i$. Inasmuch as $f(z) = z^2$ is analytic
throughout the finite z plane, the integral (7-8) is independent of the path. Moreover,
since $F(z) = \frac{1}{3}z^3$ is an indefinite integral for $f(z) = z^2$, we can write
$$\int_0^{2+i} z^2\,dz = \left.\frac{1}{3}z^3\right|_0^{2+i} = \frac{1}{3}(2 + i)^3.$$
The reader should contrast this computation with calculations required for solving this
in Prob. 1, Sec. 5.
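Formula (7-7) can be compared with a direct numerical evaluation of the line integral along two different paths. The following is a minimal sketch in plain Python; `path_integral` is a hypothetical midpoint-rule helper for polygonal paths, and the "dogleg" path through z = 2 is an arbitrary illustrative choice.

```python
def path_integral(f, points, n=2000):
    """Midpoint-rule approximation of the integral of f along a polygonal path."""
    total = 0j
    for a, b in zip(points, points[1:]):
        step = (b - a) / n
        for i in range(n):
            total += f(a + (i + 0.5) * step) * step
    return total

f = lambda z: z * z
primitive = lambda z: z ** 3 / 3          # F(z) = z^3 / 3, so F'(z) = z^2

straight = path_integral(f, [0j, 2 + 1j])             # straight line from 0 to 2 + i
dogleg = path_integral(f, [0j, 2 + 0j, 2 + 1j])       # along the x axis, then vertically
exact = primitive(2 + 1j) - primitive(0j)             # right-hand side of (7-7)

print(straight, dogleg, exact)
```

Both approximations should agree with $(2 + i)^3/3 = 2/3 + (11/3)i$ to within the discretization error.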
Example 2. Evaluate $\int_C e^z\,dz$ over some path C joining $z = 0$ and $z = \pi i$. Since $e^z$ is
analytic, we get at once from (7-7)
$$\int_0^{\pi i} e^z\,dz = \left. e^z\right|_0^{\pi i} = e^{\pi i} - 1 = -2.$$
We indicate the nature of required calculations if this integral were to be computed
by the method of Sec. 5. We first separate the integrand into real and imaginary parts,
$$e^z = e^{x+iy} = e^x \cos y + ie^x \sin y,$$
and form two real line integrals
$$\int_C e^z\,dz = \int_C (e^x \cos y + ie^x \sin y)(dx + i\,dy) = \int_C (e^x \cos y\,dx - e^x \sin y\,dy) + i\int_C (e^x \sin y\,dx + e^x \cos y\,dy).$$
Since these line integrals are independent of the path, they may be evaluated over any
convenient path joining the points (0,0) and $(0,\pi)$ corresponding to $z = 0$ and $z = \pi i$.
The result of such calculations would yield -2, as the reader can verify.
Example 3. Discuss the integral $\int_C (z - a)^m\,dz$, where m is an integer and a is a
constant.
The function $f(z) = (z - a)^m$ is obviously analytic at all points of the z plane as long
as m is a positive integer. If $m < 0$, we write $m = -n$ and consider
$$f(z) = \frac{1}{(z - a)^n}, \tag{7-9}$$
where n is a positive integer.
To evaluate $\int_{z_0}^{z} (z - a)^m\,dz$ for $m > 0$, we note that
$$F(z) = \frac{(z - a)^{m+1}}{m + 1}$$
is an indefinite integral for $f(z) = (z - a)^m$. Accordingly
$$\int_{z_0}^{z} (z - a)^m\,dz = \left.\frac{(z - a)^{m+1}}{m + 1}\right|_{z_0}^{z}. \tag{7-10}$$
If, in particular, the path C is closed, so that the limits in (7-10) coincide, we conclude
that the value of the integral is zero. This result also follows from Cauchy's theorem,
since $f(z) = (z - a)^m$ is analytic for all values of z when $m > 0$.
We consider next the integral
$$\int_C \frac{dz}{(z - a)^n} \tag{7-11}$$
and note first that if the path C passes through the point $z = a$, the integrand becomes
meaningless at $z = a$. In this book¹ we shall not consider in detail integrals over those
paths that go through singular points of the integrands, but special types of such integrals
will occur in Sec. 22.
If C is a closed path and a is not in the region R enclosed by C, the integrand in (7-11)
is analytic in the closed region R. Hence, by Cauchy's theorem the value is zero. If,
however, a lies in R, Cauchy's theorem does not apply, since $f(z) = 1/(z - a)^n$
ceases being analytic at $z = a$. The integral (7-11) can, of course, be evaluated by
the method of Sec. 5 once the equation of C is specified. However, it is wise to simplify
calculations by making use of the principle of deformation of contours. This
principle states that when $z = a$ is in the interior of C,
$$\oint_C \frac{dz}{(z - a)^n} = \oint_\gamma \frac{dz}{(z - a)^n},$$
where $\gamma$ is a circle with center at a and with radius $\rho$ so small that $\gamma$ lies within C
(Fig. 12). But the integral over $\gamma$ is easily evaluated. Setting $z - a = \rho e^{i\theta}$, we get
$dz = \rho e^{i\theta} i\,d\theta$ on observing that $\rho$ is constant on $\gamma$. Hence
$$\oint_C \frac{dz}{(z - a)^n} = \oint_\gamma \frac{\rho e^{i\theta} i\,d\theta}{\rho^n e^{in\theta}} = \frac{i}{\rho^{n-1}}\int_0^{2\pi} e^{(1-n)\theta i}\,d\theta \tag{7-12}$$
$$= \frac{i}{\rho^{n-1}}\left.\frac{e^{(1-n)\theta i}}{i(1 - n)}\right|_0^{2\pi} = 0 \qquad \text{if } n \ne 1.$$
If $n = 1$, we get
$$\oint_C \frac{dz}{z - a} = i\int_0^{2\pi} d\theta = 2\pi i. \tag{7-13}$$
In evaluating the integral (7-12), we noted that the integrand $e^{(1-n)\theta i}\,d\theta$, for $n \ne 1$, is
the differential of $e^{(1-n)\theta i}/i(1 - n)$, and we made use of the fundamental theorem of
integral calculus.
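The conclusion of Example 3 can be checked by integrating numerically around a circle centered at a, using the parametrization $z - a = \rho e^{i\theta}$ from the text. This is a sketch assuming Python's standard `cmath` and `math` modules; `circle_integral`, the center `a`, and the radius 0.7 are illustrative choices.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    """Approximate the counterclockwise integral of f over |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dtheta
        w = radius * cmath.exp(1j * theta)          # w = rho * e^{i theta}
        total += f(center + w) * 1j * w * dtheta    # dz = i * rho * e^{i theta} d(theta)
    return total

a = 0.5 - 0.25j
results = {n: circle_integral(lambda z, n=n: (z - a) ** (-n), a, 0.7) for n in (1, 2, 3)}

for n, value in results.items():
    print(n, value)
```

Only the case n = 1 gives a nonzero value, approximately $2\pi i$; the radius does not affect the result, in keeping with the principle of deformation of contours.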
Example 4. Evaluate the integral $\oint_C \dfrac{dz}{z^2 - 1}$, where C is the circle $x^2 + y^2 = 4$.
The function
$$f(z) = \frac{1}{z^2 - 1} = \frac{1}{(z - 1)(z + 1)} \tag{7-14}$$
¹ When $z = a$ lies on the path of integration, the integral in (7-11) is an improper
complex integral and it calls for special considerations analogous to those required to
treat improper real integrals. Certain types of improper complex integrals are of interest
in applications. See, for example, N. I. Muskhelishvili, "Singular Integral Equations,"
P. Noordhoff, N.V., Groningen, Netherlands, 1953.
has two singular points $z = 1$ and $z = -1$, both of which lie within the given circle
$|z| < 2$ (Fig. 13). If we delete these points from the circular region C by circles $\gamma_1$ and
$\gamma_2$ of sufficiently small radii, f(z) will be analytic in the triply connected domain exterior
to $\gamma_1$ and $\gamma_2$ and interior to C. Then Cauchy's theorem for multiply connected domains
permits us to write
$$\oint_C f(z)\,dz = \oint_{\gamma_1} f(z)\,dz + \oint_{\gamma_2} f(z)\,dz. \tag{7-15}$$
The integrals in the right-hand member in (7-15) are readily evaluated. Since
$$\frac{1}{(z - 1)(z + 1)} = \frac{1}{2}\,\frac{1}{z - 1} - \frac{1}{2}\,\frac{1}{z + 1},$$
we get
$$\oint_{\gamma_1} \frac{dz}{(z - 1)(z + 1)} = \frac{1}{2}\oint_{\gamma_1} \frac{dz}{z - 1} - \frac{1}{2}\oint_{\gamma_1} \frac{dz}{z + 1}. \tag{7-16}$$
If the radius of $\gamma_1$ is such that $\gamma_1$ contains within it $z = +1$ but not $z = -1$, then
$$\oint_{\gamma_1} \frac{dz}{z - 1} = 2\pi i$$
by (7-13), and
$$\oint_{\gamma_1} \frac{dz}{z + 1} = 0,$$
by Cauchy's integral theorem, for $1/(z + 1)$ has no singularities within $\gamma_1$. Thus, the
first integral on the right in (7-15) has the value $\pi i$. An entirely similar calculation
shows
$$\oint_{\gamma_2} \frac{dz}{(z - 1)(z + 1)} = -\pi i.$$
Therefore,
$$\oint_C \frac{dz}{(z - 1)(z + 1)} = 0$$
even though the integrand is not analytic in the region $|z| < 2$.
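The three contour integrals appearing in (7-15) can be evaluated numerically and compared. The sketch below assumes Python's standard `cmath` and `math` modules; `circle_integral` is a hypothetical helper, and the radius 0.5 for the small circles around $z = 1$ and $z = -1$ is an arbitrary choice that keeps each circle away from the other singular point.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    """Approximate the counterclockwise integral of f over |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / n
    for i in range(n):
        w = radius * cmath.exp(1j * (i + 0.5) * dtheta)
        total += f(center + w) * 1j * w * dtheta    # dz = i * w * d(theta)
    return total

f = lambda z: 1 / ((z - 1) * (z + 1))

outer = circle_integral(f, 0, 2.0)           # the circle x^2 + y^2 = 4
around_plus = circle_integral(f, 1, 0.5)     # small circle about z = +1
around_minus = circle_integral(f, -1, 0.5)   # small circle about z = -1

print(outer, around_plus, around_minus)
```

The small circles contribute approximately $\pi i$ and $-\pi i$, and their sum matches the vanishing integral over the large circle.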
PROBLEMS
1. Show that $\int_{z_0}^{z} z\,dz = \frac{1}{2}(z^2 - z_0^2)$ for all paths joining $z_0$ with z.
2. Evaluate the integral $\oint_C (z - a)^{-1}\,dz$, where C is a simple closed curve and a
is interior to C, by expressing it as a sum of two real line integrals over C. Hint: Set
$z - a = \rho e^{i\theta}$; then $dz = e^{i\theta}(d\rho + i\rho\,d\theta)$.
3. Evaluate $\int_C z^{-2}\,dz$ where the path C is the upper half of the unit circle whose
center is at the origin. What is the value of this integral if the path is the lower half of
the circle?
4. Evaluate / z~~* dz, where C is the path of Prob. 3.
Jc
5. Evaluate $\oint_C (z^2 - 2z + 1)\,dz$, where C is the circle $x^2 + y^2 = 2$.
r z 4 1
6. Discuss the integral / — =— dz , where C is a path enclosing the origin.
Jc 2
7. What is the value of the integral / (1 -f z 2 )^ 1 dz, where C is the circle x* -f y* m
Jc
8. Discuss Prob. 7 by noting that
$$\frac{1}{1 + z^2} = \frac{1}{2i}\left(\frac{1}{z - i} - \frac{1}{z + i}\right)$$
and evaluating the integrals over the unit circles whose centers are at $z = i$ and $z = -i$.
Note the theorem of Sec. 6.
9. Show that the integrals (a) f • - , (b) f sin z dz, (c) f ze* dz, (d) f z~ 2 dz vanish
Jc z — 2 Jc Jc Jc
if C is the unit circle \z \ « 1.
10. Evaluate the integral f — - — dz along the following paths C: (a) \z\ **
Jc 1—2
(b) \z | “ 2, (c) \z — 1 1 « 1, (d) \z 4* 1 1 « 1. Hint: Decompose the integrand into partial
fractions as in Prob. 8.
8. Cauchy's Integral Formula. In this section we deduce with the aid
of Cauchy's theorem the remarkable fact that every analytic function f(z)
is completely determined in the interior of the given closed region R when
the values of f(z) are specified on its boundary.
Let f(z) be analytic in a simply connected region R and on its boundary
C. If a is an interior point of R, then the function
$$\frac{f(z)}{z - a} \tag{8-1}$$
is analytic in R with the possible exception of the point $z = a$. If this
point is excluded from the region by enclosing it in a circle $\gamma$ of radius $\rho$
and with center at a (Fig. 12), then (8-1) will surely be analytic in the
region exterior to $\gamma$ and interior to C.
It follows, then, from (6-3) that
m , f m
' z
J c z ~ a J y i
dz
(8-2)
where the paths (7 and y are described in the same sense. Now the integral
in the right-hand member of (8-2) can be written as
*y z — a J y
m -m
z — a
dz + f (a) —
dz
But by (7-13)
ly z
dz
= 27 ri y
(8-3)
(8-4)
and we shall show next that the first integral on the right in (8-3) has the
value zero. Indeed, if we take $z - a = \rho e^{i\theta}$, then, as long as $z$ is on $\gamma$,
$dz = i\rho e^{i\theta}\,d\theta$, and therefore
$$\int_\gamma \frac{f(z) - f(a)}{z - a}\,dz = i\int_\gamma [f(z) - f(a)]\,d\theta. \tag{8-5}$$
Let the maximum of $|f(z) - f(a)|$ on $\gamma$ be $M$; then by (5-8)
$$\left|\int_\gamma \frac{f(z) - f(a)}{z - a}\,dz\right| \le M\int_0^{2\pi} d\theta = 2\pi M. \tag{8-6}$$
The radius $\rho$ is arbitrary, and if we make it sufficiently small, then
$\max |f(z) - f(a)|$ can be made as small as we wish, since $f(z)$ is a continuous
function. Accordingly, $M \to 0$ as $\rho \to 0$. On the other hand, from the
principle of deformation of contours, the value of the integral (8-5) is
independent of the radius $\rho$. Since $M \to 0$ when $\rho \to 0$, we conclude
that the value of the integral (8-5) is zero.
Accordingly, (8-3), together with (8-4), gives the result
$$\int_C \frac{f(z)}{z - a}\,dz = 2\pi i\,f(a). \tag{8-7}$$
We recall that the point $a$ is any interior point of the region $R$ bounded
by $C$ and $z$ is the variable of integration on the contour $C$. If we denote
the variable of integration by $\zeta$ and let $z$ be any interior point, we can
rewrite formula (8-7) as
$$f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta. \tag{8-8}$$
Formula (8-8) permits us to calculate the value of $f(z)$ at any interior
point from specified boundary values $f(\zeta)$ on the contour $C$. It is known
as Cauchy's integral formula. This formula can be extended in the manner
of Sec. 6 to multiply connected domains bounded by the exterior contour
$C_0$ and $m$ interior contours $C_1, C_2, \ldots, C_m$. The integration in (8-8) is
then performed in the clockwise sense over the interior contours and
counterclockwise over the exterior contour $C_0$.
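Formula (8-8) can be checked numerically. The following Python sketch approximates the contour integral by the trapezoidal rule on a circle and compares the result with the known value of $f(z)$. The function name `cauchy_integral`, the choice $f(z) = e^z$, the unit circle, and the sample point are assumptions made only for this illustration.

```python
import cmath
import math

def cauchy_integral(f, z, center=0.0, radius=1.0, n=2000):
    """Approximate (1/(2*pi*i)) * integral over a circle of f(zeta)/(zeta - z)."""
    total = 0j
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        w = radius * cmath.exp(1j * k * dtheta)  # zeta - center on the circle
        zeta = center + w
        dzeta = 1j * w * dtheta                  # d(zeta) = i * radius * e^{i theta} d(theta)
        total += f(zeta) / (zeta - z) * dzeta
    return total / (2j * math.pi)

z0 = 0.3 + 0.4j                                   # an interior point of the unit circle
approx = cauchy_integral(cmath.exp, z0)
print(abs(approx - cmath.exp(z0)))                # discrepancy between the two sides of (8-8)
```

The boundary values alone reproduce the interior value, which is the content of the formula; the point `z0` must lie strictly inside the circle for the comparison to make sense.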
It is not difficult to show with the aid of formula (8-8) that an analytic
function f(z) has not only continuous first derivatives in the region but
also derivatives of all orders. Thus an analytic function can be differen-
tiated infinitely many times.
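In fact, if we consider an integral of Cauchy's type,
$$F(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta, \tag{8-9}$$
where $f(\zeta)$ is any continuous (not necessarily analytic) complex function,
then this integral defines an analytic function $F(z)$. To show this we
merely have to prove that $F(z)$ has a derivative at every point of the
region $R$ bounded by $C$. We form the difference quotient with the aid of
(8-9) and get
$$F'(z) = \lim_{\Delta z \to 0}\frac{F(z + \Delta z) - F(z)}{\Delta z}
= \lim_{\Delta z \to 0}\frac{1}{\Delta z}\left[\frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{\zeta - (z + \Delta z)} - \frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{\zeta - z}\right]
= \lim_{\Delta z \to 0}\left[\frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{(\zeta - z - \Delta z)(\zeta - z)}\right].$$
On taking the limit as $\Delta z \to 0$ under the integral sign, which is legitimate
if $f(\zeta)$ is continuous, we get
$$F'(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{(\zeta - z)^2}\,d\zeta.$$
Continuing in the same way, we find
$$F''(z) = \frac{2!}{2\pi i}\int_C \frac{f(\zeta)}{(\zeta - z)^3}\,d\zeta, \quad \ldots, \quad
F^{(n)}(z) = \frac{n!}{2\pi i}\int_C \frac{f(\zeta)}{(\zeta - z)^{n+1}}\,d\zeta.$$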
We have thus shown that $F(z)$ defined by (8-9) has derivatives of all
orders even when nothing is said about the relation of the values of $F(z)$
on the boundary $C$ to the function $f(\zeta)$ appearing in the integrand. In
the special case when $f(\zeta) = F(\zeta)$, we have a formula for the $n$th derivative
of the analytic function $f(z)$ at any interior point of $R$ in terms of the
values of $f(z)$ on $C$:
$$f^{(n)}(z) = \frac{n!}{2\pi i}\int_C \frac{f(\zeta)}{(\zeta - z)^{n+1}}\,d\zeta, \qquad n = 0, 1, 2, \ldots. \tag{8-10}$$
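A short Python sketch can illustrate formula (8-10). It evaluates the integral by the trapezoidal rule on a circle centered at $z$ and compares the result with a derivative known in closed form. The name `cauchy_derivative`, the test function $\sin z$, and the radius are assumptions chosen for the illustration.

```python
import cmath
import math

def cauchy_derivative(f, z, order, radius=1.0, n=2000):
    """Approximate f^(order)(z) = order!/(2*pi*i) * integral of f(zeta)/(zeta - z)^(order+1)."""
    total = 0j
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        w = radius * cmath.exp(1j * k * dtheta)   # zeta - z on the circle
        dzeta = 1j * w * dtheta
        total += f(z + w) / w ** (order + 1) * dzeta
    return math.factorial(order) * total / (2j * math.pi)

# The third derivative of sin z is -cos z.
print(cauchy_derivative(cmath.sin, 0.5, 3), -math.cos(0.5))
```

Only values of the function on the circle are used, so the derivative of every order is fixed by boundary data, as the text asserts.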
We conclude this section by noting some important consequences of formula (8-7).
Let the path $C$ be the circle $|z - a| = \rho$ with center at $z = a$ and with radius $\rho$. Suppose
that the maximum value of the modulus of $f(z)$ on this circle is $M$; then by (5-8)
$$|f(a)| \le \frac{M}{2\pi\rho}\,2\pi\rho = M.$$
This result is independent of the radius $\rho$. Consequently $|f(z)|$ at the center $a$ of the
circle is not greater than its maximum value on the boundary. Using this result one can
prove that if $f(z)$ is analytic in a given region $R$ bounded by a curve $C$, and if $M$ is the
maximum value of $|f(z)|$ on $C$, then $|f(z)| < M$ at each interior point of $R$ unless
$|f(z)| = M$ throughout the region. This result is known as the maximum modulus
theorem.* The fact that $|f(z)| \le M$ follows from Sec. 24, Chap. 6, if we note that
$\log |f(z)|$ is harmonic.
Example 1. Find the value of the integral $\int_C \frac{\sin z}{z}\,dz$ if $C$ is the ellipse $x^2 + 4y^2 = 1$.
Since $\sin z$ is analytic in the region bounded by $C$, formula (8-7) yields, upon setting
$f(z) = \sin z$ and $a = 0$,
$$\int_C \frac{\sin z}{z}\,dz = 2\pi i \sin 0 = 0.$$
Example 2. Evaluate the integral $\int_C \frac{e^z}{z + 1}\,dz$ over the circular path $|z| = 2$.
The point $z = -1$ lies within the given circle, and since $e^z$ is analytic within $C$,
formula (8-7) yields
$$\int_C \frac{e^z}{z + 1}\,dz = 2\pi i\,e^{-1} = \frac{2\pi i}{e}.$$
Example 3. Find the value of the integral
$$\int_C \frac{\tan z}{[z - (\pi/4)]^2}\,dz,$$
where $C$ is the circle $|z| = 1$.
The point $z = \pi/4$ lies within $C$, and we note that $\tan z$ is analytic for $|z| \le 1$. From
(8-10)
$$f'(a) = \frac{1}{2\pi i}\int_C \frac{f(z)}{(z - a)^2}\,dz.$$
Hence
$$\int_C \frac{\tan z}{[z - (\pi/4)]^2}\,dz = 2\pi i\left(\frac{d \tan z}{dz}\right)_{z = \pi/4} = 2\pi i \sec^2\frac{\pi}{4} = 4\pi i.$$
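The value $4\pi i$ obtained in Example 3 can be confirmed by direct numerical integration around the unit circle. The sketch below is only a sanity check; the helper name `circle_integral` and the number of sample points are assumptions of this illustration.

```python
import cmath
import math

def circle_integral(g, center, radius, n=4000):
    """Trapezoidal approximation of the integral of g(z) dz over a circle, counterclockwise."""
    total = 0j
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        w = radius * cmath.exp(1j * k * dtheta)
        total += g(center + w) * 1j * w * dtheta   # dz = i * w * d(theta)
    return total

# Integrand of Example 3: tan z / (z - pi/4)^2 on |z| = 1.
value = circle_integral(lambda z: cmath.tan(z) / (z - math.pi / 4) ** 2, 0.0, 1.0)
print(value, 4j * math.pi)
```

The same helper applied to $\sin z / z$ returns a value near zero, in agreement with Example 1 when the ellipse is deformed into a circle about the origin.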
* See proof, for example, in E. C. Titchmarsh, "The Theory of Functions," 2d ed.,
p. 164, Oxford University Press, London, 1939.
PROBLEMS
r -f 1
1. If f(z) «* J - <ff , where C is the circle of radius 2 about the origin,
find the value of /(I — t).
2. Apply Cauchy’s integral formula to Prob. 7, Sec. 7. Use the integrand in the
form given in Prob. 8, Sec. 7.
8. Evaluate the following integrals over the closed path C formed by the lines x • dfcl,
t sin z F cos z F g* f
y — ±1: (a) / dz, (6) / dz, (e) / — dz, (d) / (sin * + «•) d*.
Jc 25 /C * JC 2 ~ 3^1 7c
F cosh z ,
M /
dr *
4. Evaluate with the aid of Cauchy's integral formula
$$\int_C \frac{3\zeta^2 + \zeta}{\zeta^2 - 1}\,d\zeta,$$
where $C$ is the circle $|\zeta| = 2$. Hint: Decompose the integrand into partial fractions.
5. What is the value of the integral of Prob. 4 when evaluated over the circle $|\zeta - 1| = 1$?
Hint: Note that $(3\zeta^2 + \zeta)/(\zeta + 1)$ is analytic for $|\zeta - 1| \le 1$.
/*3z 2 -f 2z “ 1
8. Evaluate / dz, where C is the circle \z \ » 1.
Jc 2
7. Can $|f(z)|$ assume a minimum value at an interior point of a region within which
$f(z)$ is analytic? Consider $f(z) = z$.
8. Can $|f(z)|$ assume a nonzero minimum at an interior point of a region within which
$f(z)$ is analytic? Hint: Consider $1/f(z)$.
9. Harmonic Functions. We saw in the preceding section that a function
analytic at a given point of the region has derivatives of all orders at that
point. It follows from this that the real and imaginary parts of an analytic
function $f(z) = u + iv$ have partial derivatives of all orders throughout
the region where $f(z)$ is analytic, for by (4-5) and (4-6)
$$f'(z) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} - i\frac{\partial u}{\partial y},$$
and since $f'(z)$ is also analytic,
$$f''(z) = \frac{\partial^2 u}{\partial x^2} + i\frac{\partial^2 v}{\partial x^2}
= \frac{\partial^2 v}{\partial x\,\partial y} - i\frac{\partial^2 u}{\partial x\,\partial y}
= -\frac{\partial^2 u}{\partial y^2} - i\frac{\partial^2 v}{\partial y^2}.$$
The fact that $f''(z)$ is analytic enables us to differentiate again to obtain
the third partial derivatives, and so on.
Inasmuch as the existence of the third partial derivatives ensures the
equality of mixed partial derivatives of the second order, we can show
that the real and imaginary parts of an analytic function satisfy Laplace's
equation throughout the region of analyticity of $f(z)$; for on differentiating
the first of Cauchy-Riemann equations (4-7) with respect to $y$ and the second
with respect to $x$, we get
$$\frac{\partial^2 v}{\partial y^2} = \frac{\partial^2 u}{\partial y\,\partial x}, \qquad
\frac{\partial^2 v}{\partial x^2} = -\frac{\partial^2 u}{\partial x\,\partial y},$$
and adding these we find
$$\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0.$$
The fact that $u$ also satisfies Laplace's equation
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$
follows similarly from the differentiation of the first of Eqs. (4-7) with
respect to $x$ and the second with respect to $y$.
Any real function $u(x,y)$ with continuous second partial derivatives
which satisfies Laplace's equation in a given region is called harmonic
in that region. Thus the real and imaginary parts of a function analytic
in the region $R$ are harmonic functions. Two harmonic functions $u(x,y)$,
$v(x,y)$ such that $u + iv$ is an analytic function $f(z)$ are said to be conjugate
harmonics. We shall show next that if one harmonic function is given, its
conjugate harmonic can be determined to within a constant of integration.
For, let $u(x,y)$ be given in $R$. Then if $v(x,y)$ is a conjugate harmonic, these
functions satisfy the Cauchy-Riemann equations
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$
Hence
$$dv = \frac{\partial v}{\partial x}\,dx + \frac{\partial v}{\partial y}\,dy = -\frac{\partial u}{\partial y}\,dx + \frac{\partial u}{\partial x}\,dy,$$
and, since $\partial u/\partial x$ and $\partial u/\partial y$ are known from $u(x,y)$, we have
$$v(x,y) = \int_{(x_0,y_0)}^{(x,y)}\left(-\frac{\partial u}{\partial y}\,dx + \frac{\partial u}{\partial x}\,dy\right), \tag{9-2}$$
where the integral can be evaluated over any path joining an arbitrary
point $(x_0,y_0)$ of $R$ with $(x,y)$. Since the value of the line integral (9-2)
depends on the choice of $(x_0,y_0)$, it is clear that $v(x,y)$ is determined only
to within an arbitrary constant. The integral is independent of the path
inasmuch as
$$\frac{\partial}{\partial y}\left(-\frac{\partial u}{\partial y}\right) = \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right),$$
and this equation is true because $u(x,y)$ is harmonic. It should be noted
that when the region $R$ is not simply connected, the function $v(x,y)$ may
turn out to be multiple-valued.¹
The connection of analytic functions with Laplace's equation is one
of the principal reasons for the importance of the theory of functions of
complex variables in applied mathematics.
In the preceding section we noted the maximum modulus theorem for
analytic functions. This theorem enables us to prove the important fact
that the maximum values of harmonic functions (which are not mere constants)
are invariably assumed on the boundary of the region.
Let $u$ be harmonic in the region $R$ whose boundary is $C$. If $v$ is a conjugate
harmonic, then $u + iv$ is an analytic function, and therefore the
function
$$e^{u+iv} = e^u(\cos v + i \sin v)$$
is also analytic. But the maximum of $|e^{u+iv}| = e^u$ is assumed on the
boundary $C$ of $R$ by the maximum modulus theorem. Since $e^u$ takes on
its maximum on the boundary $C$, $u(x,y)$ must assume its maximum on $C$.
Example: The function $u = x^2 - y^2$ is harmonic in every region. Obtain a conjugate
harmonic $v$.
Inserting $u$ in the formula (9-2) yields
$$v(x,y) = \int_{(x_0,y_0)}^{(x,y)} (2y\,dx + 2x\,dy) = 2\int_{(x_0,y_0)}^{(x,y)} d(xy) = 2xy + c,$$
where $c = -2x_0y_0$.
In this problem the integrand is so simple that we wrote its differential by inspection.
In a more complicated case it may prove more expedient to evaluate the integral over
some convenient path rather than reduce the integrand to the form of a differential of
some function.
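Evaluating the line integral (9-2) over a convenient path can be sketched in Python as follows. The straight segment from $(x_0,y_0)$ to $(x,y)$, the midpoint rule, and the name `conjugate_harmonic` are assumptions made for this illustration; the partial derivatives of $u$ are supplied by the caller.

```python
def conjugate_harmonic(u_x, u_y, x, y, x0=0.0, y0=0.0, n=1000):
    """Integrate -u_y dx + u_x dy along the straight segment from (x0, y0) to (x, y)."""
    dx, dy = x - x0, y - y0
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n                      # midpoint of the k-th subinterval of [0, 1]
        px, py = x0 + t * dx, y0 + t * dy      # point on the segment
        total += (-u_y(px, py) * dx + u_x(px, py) * dy) / n
    return total

# u = x^2 - y^2, so u_x = 2x and u_y = -2y; the text gives v = 2xy + c with c = -2*x0*y0.
u_x = lambda x, y: 2.0 * x
u_y = lambda x, y: -2.0 * y
print(conjugate_harmonic(u_x, u_y, 1.5, -2.0))                  # base point (0, 0): c = 0
print(conjugate_harmonic(u_x, u_y, 1.5, -2.0, x0=1.0, y0=1.0))  # base point (1, 1): c = -2
```

Changing the base point shifts the result by a constant only, which is the arbitrary constant of integration mentioned in the text.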
PROBLEMS
1. Prove that $v = 3x^2y - y^3$ is harmonic, and find a conjugate harmonic $u$.
2. Find an analytic function $f(z) = u + iv$ if:
(a) $u = x$;
(b) $u = \cosh y \cos x$;
(c) $u = x/(x^2 + y^2)$;
(d) $u = e^x \cos y$;
(e) $u = \log \sqrt{x^2 + y^2}$.
¹ See Sec. 5, Chap. 5.
10. Taylor's Series. In this section we are concerned with the power-series
representation of analytic functions. The reader is advised to review
Secs. 8, 9, and 16 of Chap. 2 dealing with the properties of power series.
Here we recall that when the power series $\sum_{k=0}^{\infty} a_k z^k$ converges for $z = z_1$,
it converges absolutely and uniformly in every closed circular region
$|z| \le r$, where $r < |z_1|$. A circle of radius $r$ such that $\sum a_k z^k$ converges for
$|z| < r$ and diverges for every $|z| > r$ is called the circle of convergence,
and the number $r$ is the radius of convergence. The radius of convergence
can frequently be determined with the aid of the ratio test. Thus
$$r = \lim_{n \to \infty}\left|\frac{a_{n-1}}{a_n}\right|$$
whenever this limit exists.¹
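Example: The series $\sum_{n=1}^{\infty} \frac{(-1)^n z^n}{n}$ has the radius of convergence $r = 1$, since
$$\lim_{n \to \infty}\left|\frac{a_{n-1}}{a_n}\right| = \lim_{n \to \infty}\frac{n}{n - 1} = 1.$$
The series $\sum_{n=0}^{\infty} n!\,z^n$ converges only for $z = 0$, since in this case
$$\lim_{n \to \infty}\left|\frac{a_{n-1}}{a_n}\right| = \lim_{n \to \infty}\frac{1}{n} = 0.$$
On the other hand, the series $\sum_{n=0}^{\infty} \frac{z^n}{n!}$ converges for all values of $z$, since
$$\lim_{n \to \infty}\left|\frac{a_{n-1}}{a_n}\right| = \lim_{n \to \infty}\frac{n!}{(n - 1)!} = \infty.$$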
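The three limits above can be tabulated with exact rational arithmetic. In this Python sketch the helper `ratio` and the three coefficient functions are names introduced only for illustration; the index $n$ must be at least 2 for the first series, whose coefficient $a_0$ is not defined.

```python
from fractions import Fraction
from math import factorial

def ratio(a, n):
    """Return |a_{n-1} / a_n| exactly for a coefficient function a(n)."""
    return abs(a(n - 1) / a(n))

alternating = lambda n: Fraction((-1) ** n, n)   # a_n = (-1)^n / n, radius 1
growing = lambda n: Fraction(factorial(n))       # a_n = n!, radius 0
decaying = lambda n: Fraction(1, factorial(n))   # a_n = 1/n!, infinite radius

for n in (10, 100, 1000):
    print(n, float(ratio(alternating, n)), float(ratio(growing, n)), float(ratio(decaying, n)))
```

The first column of ratios approaches 1, the second approaches 0, and the third grows without bound, matching the radii of convergence stated in the example.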
¹ See Sec. 8, Chap. 2.
We saw in Sec. 9, Chap. 2, that with every real function $f(x)$ having derivatives
of all orders at a given point $x = a$, we can associate the power series
$$\sum_{n=0}^{\infty} a_n(x - a)^n$$
with $a_n = f^{(n)}(a)/n!$, which usually converges to $f(x)$ in some interval
about the point $x = a$. However, the existence of infinitely many derivatives
at $x = a$ does not ensure the convergence of the series $\sum a_n(x - a)^n$
to $f(x)$. To ensure convergence, the remainder in the Taylor formula (9-1)
of Chap. 2 must approach zero.
Inasmuch as every function $f(z)$ which is analytic at $z = a$ has infinitely
many derivatives at that point, we can write down the series
$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(z - a)^n$$
which converges in some circular region $|z - a| < r$. The question is:
Does such a series invariably converge to $f(z)$?
We prove next (in contradistinction to the situation with the corresponding
real series) that analytic functions can always be represented by power
series.
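Let $f(z)$ be analytic in some region $R$, and let $C$ be a circle lying wholly
in $R$ and having its center at $a$. If $z$ is any point interior to $C$ (Fig. 14),
then it follows from Cauchy's integral formula that
$$f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta
= \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - a}\left[\frac{1}{1 - (z - a)/(\zeta - a)}\right]d\zeta. \tag{10-1}$$
But by long division
$$\frac{1}{1 - t} = 1 + t + t^2 + \cdots + t^{n-1} + \frac{t^n}{1 - t},$$
and substituting this expression with $t = (z - a)/(\zeta - a)$ in (10-1) leads to
$$f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - a}\,d\zeta + \frac{z - a}{2\pi i}\int_C \frac{f(\zeta)}{(\zeta - a)^2}\,d\zeta + \cdots
+ \frac{(z - a)^{n-1}}{2\pi i}\int_C \frac{f(\zeta)}{(\zeta - a)^n}\,d\zeta + R_n,$$
where
$$R_n = \frac{(z - a)^n}{2\pi i}\int_C \frac{f(\zeta)}{(\zeta - a)^n(\zeta - z)}\,d\zeta.$$
Making use of (8-10) gives
$$f(z) = f(a) + f'(a)(z - a) + \frac{f''(a)}{2!}(z - a)^2 + \cdots + \frac{f^{(n-1)}(a)}{(n - 1)!}(z - a)^{n-1} + R_n. \tag{10-2}$$
By taking $n$ sufficiently large, the modulus of $R_n$ may be made as small
as desired. In order to show this, let the maximum value of $|f(\zeta)|$ on $C$
be $M$, the radius of the circle $C$ be $r$, and the modulus of $z - a$ be $\rho$. Then
$|\zeta - z| \ge r - \rho$, as shown in Fig. 14, and
$$|R_n| = \frac{|z - a|^n}{2\pi}\left|\int_C \frac{f(\zeta)}{(\zeta - a)^n(\zeta - z)}\,d\zeta\right|
\le \frac{\rho^n}{2\pi}\,\frac{M\,2\pi r}{r^n(r - \rho)} = \frac{Mr}{r - \rho}\left(\frac{\rho}{r}\right)^n.$$
Since $\rho/r < 1$, it follows that $\lim_{n \to \infty} |R_n| = 0$ for every $z$ interior to $C$. Thus,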
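The remainder estimate can be checked on a concrete case. The Python sketch below assumes $f(z) = 1/(1 - z)$, $a = 0$, and a circle $C$ of radius $r = 0.5$, on which the maximum of $|f|$ is $M = 2$; these choices and the function names are assumptions of the illustration, not part of the proof.

```python
def taylor_remainder(z, n):
    """R_n for f(z) = 1/(1 - z) about a = 0: f(z) minus the terms 1 + z + ... + z^(n-1)."""
    partial = sum(z ** k for k in range(n))
    return 1.0 / (1.0 - z) - partial

def remainder_bound(M, r, rho, n):
    """The estimate M*r/(r - rho) * (rho/r)^n derived in the text."""
    return M * r / (r - rho) * (rho / r) ** n

r = 0.5                 # radius of the circle C
M = 1.0 / (1.0 - r)     # maximum of |1/(1 - zeta)| on |zeta| = r
z = 0.25j               # a point interior to C
rho = abs(z)

for n in range(1, 15):
    print(n, abs(taylor_remainder(z, n)), remainder_bound(M, r, rho, n))
```

Both columns decrease geometrically, and the actual remainder stays below the bound for every $n$, as the inequality requires.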
one can write the infinite series
$$f(z) = f(a) + f'(a)(z - a) + \frac{f''(a)}{2!}(z - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(z - a)^n + \cdots \tag{10-3}$$
which converges to $f(z)$ at every point $z$ interior to the circle $|z - a| = r$.
The series (10-3) is the Taylor series of $f(z)$ expanded about the point
$z = a$. As in Chap. 2, Sec. 9, one can prove that the representation (10-3)
is unique.
Let $z = z_0$ be the singular point of $f(z)$ nearest $z = a$; then $f(z)$ is analytic
in the circular region $|z - a| < r_0$, where $r_0 = |z_0 - a|$. This circular
region will then be the circle of convergence of the series (10-3) inasmuch
as the series diverges for $|z - a| > r_0$. It should be noted, however, that
there may be points of the region $R$ where $f(z)$ is analytic which lie outside
the circle of convergence of this series. However, one can always choose a new
point $a$ about which the expansion is performed so that the circle of
convergence of Taylor's series about that particular point contains within
it the desired value of $z$ as long as $f(z)$ is analytic at $z$. In this
manner the region $R$ can be covered by a set of overlapping circles each
of which is associated with some Taylor-series representation of $f(z)$.
For example, if $f(z) = 1/(1 - z)$, then the expansion of $f(z)$ about $z = 0$ yields
$$f(z) = 1 + z + z^2 + \cdots.$$
The circle of convergence of this series is \z \ ** 1 . But f(z) ** 1/(1 — z) is analytic at
f * (%)% (Fig. 15), which lies outside the circle \z\ * 1. If we take a «* t, the formula
(10*3) yields the series whose circle of convergence is |z — t| « \/5, and this circle in-
cludes the point z - (J£)t. The reader may find it instructive to deduce the expansion
for f(z) ** 1/(1 — z) in powers of z — i and determine the radius of convergence with
the aid of the ratio test.
PROBLEMS
1. Expand $f(z) = 1/(1 - z)$ in Taylor's series about (a) $z = 0$, (b) $z = -1$, (c) $z = i$,
and draw the circles of convergence for each of the series. What relation do the radii
of convergence of these series bear to the distance from the point $z = 1$ to the point
about which the series expansion is obtained?
2. Expand $f(z) = \log z$ in the Taylor series about $z = 1$, and determine the radius of
convergence.
3. Obtain the Taylor expansion about $z = 0$ for the following functions, and determine
the radii of convergence of the resulting series: (a) $e^z$, (b) $\sin z$, (c) $\cos z$,
(d) $\log (1 + z)$, (e) $\cosh z$.
4. Expand $f(z) = \sinh z$ in Taylor's series about the point $z = \pi i$, and determine the
radius of convergence of the resulting series.
5. Discuss the validity of the expansion $(1 + z)^m = 1 + mz + [m(m - 1)/2!]z^2 + \cdots$
for arbitrary values of $m$.
6. Verify the expansions:
(a) $\frac{1}{z^2} = \sum_{n=0}^{\infty} (n + 1)(z + 1)^n$ for $|z + 1| < 1$,
(b) $e^z = e \sum_{n=0}^{\infty} \frac{(z - 1)^n}{n!}$ for $|z| < \infty$.
11. Laurent's Expansion. We have just shown that a function $f(z)$
which is analytic at a given point $a$ can be represented in the neighborhood
of that point in a power series. Moreover, this series represents
$f(z)$ in the interior of the circular region centered at $a$ and whose radius is
equal to the distance of $a$ from the nearest singular point of $f(z)$. In this
section we prove a more general theorem due to Laurent.
Laurent's Theorem. A function $f(z)$ analytic in the interior and on the
boundary of the circular ring determined by $|z - a| = R_1$ and $|z - a| = R_2$,
with $R_2 < R_1$ (Fig. 16), can be represented at every interior point of the ring
in the form
$$f(z) = \sum_{n=0}^{\infty} a_n(z - a)^n + \sum_{n=1}^{\infty} \frac{a_{-n}}{(z - a)^n}, \tag{11-1}$$
where
$$a_n = \frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)}{(\zeta - a)^{n+1}}\,d\zeta, \qquad n = 0, 1, 2, \ldots, \tag{11-2}$$
$$a_{-n} = \frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)}{(\zeta - a)^{-n+1}}\,d\zeta, \qquad n = 1, 2, \ldots, \tag{11-3}$$
$C_1$ and $C_2$ being the boundaries of the ring.
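To prove the theorem we recall that Cauchy's formula (8-8), when
applied to the circular ring, enables us to write
$$f(z) = \frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)}{\zeta - z}\,d\zeta - \frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)}{\zeta - z}\,d\zeta, \tag{11-4}$$
where $z$ is any point in the interior of the ring.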
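We show next that the integrals in the right-hand member of (11-4) can
be represented by the series appearing in (11-1). We begin with the
integral over $C_1$ and note that if $\zeta$ is on $C_1$ and $z$ is in the ring, then
$$\frac{1}{\zeta - z} = \frac{1}{\zeta - a}\cdot\frac{1}{1 - (z - a)/(\zeta - a)} = \frac{1}{\zeta - a}\sum_{n=0}^{\infty} \frac{(z - a)^n}{(\zeta - a)^n},$$
since¹ $|z - a|/|\zeta - a| < 1$. Thus,
$$\frac{1}{\zeta - z} = \sum_{n=0}^{\infty} \frac{(z - a)^n}{(\zeta - a)^{n+1}},$$
and hence
$$\frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{2\pi i}\oint_{C_1} \sum_{n=0}^{\infty} \frac{f(\zeta)(z - a)^n}{(\zeta - a)^{n+1}}\,d\zeta. \tag{11-5}$$
Since integration of the series term by term can be justified as in the
discussion of (10-1), we can write
$$\frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \sum_{n=0}^{\infty} (z - a)^n\,\frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)}{(\zeta - a)^{n+1}}\,d\zeta = \sum_{n=0}^{\infty} a_n(z - a)^n,$$
where we define $a_n$ by the formula (11-2). This establishes the equality of
the first terms in the right-hand members of (11-1) and (11-4).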
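We consider next the second integral in (11-4). If $\zeta$ is on $C_2$, then
$$\frac{1}{\zeta - z} = -\frac{1}{z - a}\cdot\frac{1}{1 - (\zeta - a)/(z - a)} = -\sum_{n=0}^{\infty} \frac{(\zeta - a)^n}{(z - a)^{n+1}},$$
since $|\zeta - a|/|z - a| < 1$ in this case. Hence
$$-\frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{2\pi i}\oint_{C_2} \sum_{n=0}^{\infty} \frac{f(\zeta)(\zeta - a)^n}{(z - a)^{n+1}}\,d\zeta,$$
and the integration of the series term by term now yields
$$-\frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{2\pi i}\sum_{n=0}^{\infty} \frac{1}{(z - a)^{n+1}}\oint_{C_2} f(\zeta)(\zeta - a)^n\,d\zeta = \sum_{n=1}^{\infty} \frac{a_{-n}}{(z - a)^n},$$
where we set
$$a_{-n} = \frac{1}{2\pi i}\oint_{C_2} \frac{f(\zeta)}{(\zeta - a)^{-n+1}}\,d\zeta, \qquad n = 1, 2, \ldots.$$
This establishes the equality of the second terms in the right-hand members
of (11-1) and (11-4), and the theorem is proved.
¹ Note that $\frac{1}{1 - t} = \sum_{n=0}^{\infty} t^n$ if $|t| < 1$.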
We note that if $f(z)$ is also analytic in the interior of the circle $C_2$, then
the integrand in (11-3) is an analytic function and hence $a_{-n} = 0$ by
Cauchy's integral theorem. In this case (11-1) reduces to the Taylor
series, since
$$a_n = \frac{1}{2\pi i}\oint_{C_1} \frac{f(\zeta)}{(\zeta - a)^{n+1}}\,d\zeta = \frac{f^{(n)}(a)}{n!}$$
by (8-10).
We can write the series (11-1) more compactly as
$$f(z) = \sum_{n=-\infty}^{\infty} a_n(z - a)^n, \tag{11-6}$$
where the $a_n$ can be computed from the formula
$$a_n = \frac{1}{2\pi i}\oint_{\Gamma} \frac{f(\zeta)}{(\zeta - a)^{n+1}}\,d\zeta, \qquad n = 0, \pm 1, \pm 2, \ldots, \tag{11-7}$$
and $\Gamma$ is any simple closed path¹ which lies in the ring and encloses $C_2$.
It is possible to prove that the representation of $f(z)$ in a given circular
ring in the series (11-6) is unique.² Hence if one obtains for $f(z)$ a representation
$$f(z) = \sum_{n=-\infty}^{\infty} b_n(z - a)^n$$
in a certain ring with the center at $a$, the coefficients $b_n$ in this representation
must be identical with those given by formula (11-7). This frequently
enables one to deduce the Laurent series without evaluating the integrals
(11-7).
¹ Recall that the integrals (11-2) and (11-3) have the same values when calculated
over any path $\Gamma$ into which $C_1$ and $C_2$ may be deformed without leaving the ring.
² See, for example, E. C. Titchmarsh, "The Theory of Functions," p. 101, 2d ed.,
Oxford University Press, London, 1939.
For example, let $f(z) = e^z/z^2$, and let it be required to obtain the expansion about the origin.
Since $e^z = 1 + z + (z^2/2!) + \cdots + (z^n/n!) + \cdots$, we have for any $z \ne 0$
$$\frac{e^z}{z^2} = \frac{1}{z^2} + \frac{1}{z} + \frac{1}{2!} + \frac{z}{3!} + \cdots.$$
This is a Laurent expansion about the origin; hence, by the uniqueness of the
representation (11-6), it is the Laurent expansion about the origin.
The Laurent expansion for $e^{1/z}$, valid for all $|z| > 0$, can be obtained from the series
$e^u = 1 + u + (u^2/2!) + \cdots$ by letting $u = 1/z$.
As another illustration, consider
$$f(z) = \frac{z}{(z - 1)(z - 3)}. \tag{11-8}$$
This function has two singular points: $z = 1$ and $z = 3$. To obtain the Laurent series
$\sum_{n=-\infty}^{\infty} a_n(z - 1)^n$ valid in the neighborhood of $z = 1$, we can proceed as follows. Set
$\varphi(z) = z/(z - 3)$, and expand $\varphi(z)$ in Taylor's series about $z = 1$. The result is
$$\frac{z}{z - 3} = -\frac{1}{2} - 3\sum_{n=1}^{\infty} \frac{(z - 1)^n}{2^{n+1}}. \tag{11-9}$$
Since $z = 3$ is a singular point of $\varphi(z)$, we conclude that (11-9) converges as long as
$|z - 1| < 2$. On multiplying this series by $1/(z - 1)$, we get
$$\frac{z}{(z - 1)(z - 3)} = -\frac{1}{2(z - 1)} - 3\sum_{n=1}^{\infty} \frac{(z - 1)^{n-1}}{2^{n+1}},$$
which is valid for $0 < |z - 1| < 2$.
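The coefficients just obtained can be compared with the integral formula (11-7). The Python sketch below evaluates (11-7) by the trapezoidal rule on the circle $|z - 1| = 1$, which lies in the ring $0 < |z - 1| < 2$; the name `laurent_coefficient`, the radius, and the sample count are assumptions of this illustration.

```python
import cmath
import math

def laurent_coefficient(f, a, n, radius, samples=2000):
    """Approximate a_n = 1/(2*pi*i) * integral of f(zeta)/(zeta - a)^(n+1) over a circle about a."""
    total = 0j
    dtheta = 2.0 * math.pi / samples
    for k in range(samples):
        w = radius * cmath.exp(1j * k * dtheta)   # zeta - a
        dzeta = 1j * w * dtheta
        total += f(a + w) / w ** (n + 1) * dzeta
    return total / (2j * math.pi)

f = lambda z: z / ((z - 1) * (z - 3))

# The series in the text gives a_{-1} = -1/2 and a_k = -3 / 2^(k+2) for k >= 0.
for n in (-2, -1, 0, 1):
    print(n, laurent_coefficient(f, 1.0, n, 1.0))
```

Because the representation in a given ring is unique, agreement of these numbers with the series found by multiplication is expected rather than coincidental.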
To obtain the expansion of $f(z)$ in (11-8) about $z = 3$, we set $\varphi(z) = z/(z - 1)$, expand
it in Taylor's series about $z = 3$, and multiply the result by $1/(z - 3)$.
The expansion for $f(z)$ in (11-8) valid for $|z| > 3$ can be deduced as follows: We
decompose $f(z)$ into partial fractions and find
$$\frac{z}{(z - 1)(z - 3)} = -\frac{1/2}{z - 1} + \frac{3/2}{z - 3}. \tag{11-10}$$
But
$$\frac{1}{z - 1} = \frac{1}{z}\cdot\frac{1}{1 - (1/z)} = \frac{1}{z}\left(1 + \frac{1}{z} + \frac{1}{z^2} + \cdots\right) \qquad \text{for } |z| > 1$$
and
$$\frac{1}{z - 3} = \frac{1}{z}\cdot\frac{1}{1 - (3/z)} = \frac{1}{z}\left(1 + \frac{3}{z} + \frac{3^2}{z^2} + \cdots\right) \qquad \text{for } |z| > 3.$$
Substitution of these series in (11-10) yields the desired expansion.
The reader may find it instructive to obtain the same expansion by writing
$$f(z) = \frac{z}{(z - 1)(z - 3)} = \frac{1}{z}\cdot\frac{1}{1 - (1/z)}\cdot\frac{1}{1 - (3/z)} \tag{11-11}$$
and forming the product of the appropriate series for the factors in the right-hand
member of (11-11).
PROBLEMS
1. Obtain Laurent’s expansions for /(f) *» 1/[*(1 — *)*): (a) about f *• 0, (6) about
s « 1.
2 . Obtain Laurent's expansion for e~ 1/,s valid for \z\ > 0.
8. Expand in Laurent's series about z «* 1: (a) (x — I)®, ( b ) l/(x — 1)* # (c) (s — l) 1 *f
»/(i - i)*l.
4. Obtain Laurent's expansion for $f(z) = 1/[(z - 1)(z - 2)]$ valid in the following
regions: (a) $|z - 1| < 1$, (b) $|z| > 2$, (c) $1 < |z| < 2$. Note that in (b) and (c) the
desired expansions have the forms $\sum_{n=-\infty}^{\infty} a_n z^n$. Show that
$$\frac{1}{z - 1} = \frac{1}{z}\sum_{n=0}^{\infty} \frac{1}{z^n} \qquad \text{for } |z| > 1,$$
and
$$\frac{1}{z - 2} = \frac{1}{z}\sum_{n=0}^{\infty} \left(\frac{2}{z}\right)^n \qquad \text{for } |z| > 2.$$
5. Show that $f(z) = 1/[z^2(1 - z)]$ has the following expansions:
(a) $\sum_{n=0}^{\infty} z^{n-2}$, valid for $0 < |z| < 1$,
(b) $-\sum_{n=0}^{\infty} \frac{1}{z^{n+3}}$, valid for $|z| > 1$.
12. Singular Points, Residues. If $z = a$ is a singular point of an
analytic function $f(z)$ and the neighborhood of $z = a$ contains no other
singular points of $f(z)$, the singularity at $z = a$ is said to be isolated.
Thus, $f(z) = 1/z$ has an isolated singular point $z = 0$ because the circle
$|z| = \rho > 0$ contains no singular points other than $z = 0$ within it. The
function
$$f(z) = \frac{z - 1}{z(z^2 + 1)}$$
has three isolated singular points: $z = 0$, $z = i$, $z = -i$. The function
/(*) *
has two isolated singular points: $z = 1$ and $z = -1$. Not all singular
points of analytic functions are isolated, however. For example,
$$f(z) = \frac{1}{\sin (1/z)} \tag{12-1}$$
has a singularity whenever $z = \pm 1/(k\pi)$, $k = 1, 2, \ldots$. These singular
points are isolated. But (12-1) also has a singular point $z = 0$, which is
not isolated, for, no matter how small the radius $\rho$ of the circle $|z| = \rho$ may
be, this circle contains infinitely many singular points $z = \pm 1/(k\pi)$ in its
interior.
The function $\log z$ has a singularity at $z = 0$, and so does $\sqrt{z}$. These
singularities are not isolated because every circle $|z| = \rho$ includes part
of the positive real axis, upon crossing which the single-valued branches of
$\log z$ and $\sqrt{z}$ suffer discontinuities if the real axis is chosen to be the cut,
as in Secs. 16 and 17. The points at which the branches of a multiple-valued
function assume equal values are called the branch points. For the
present we shall restrict our considerations to single-valued functions.
If $z = a$ is an isolated singular point of $f(z)$, then in the neighborhood
of $z = a$ the function $f(z)$ can be represented by the Laurent series
$$f(z) = \sum_{n=0}^{\infty} a_n(z - a)^n + \sum_{n=1}^{\infty} \frac{a_{-n}}{(z - a)^n}. \tag{12-2}$$
Some coefficients in (12-2) may vanish, and there are two nontrivial cases
that present themselves:
1. The expansion (12-2) contains at most a finite number $m$ of terms
with negative powers of $z - a$, so that (12-2) reads
$$f(z) = \sum_{n=0}^{\infty} a_n(z - a)^n + \frac{a_{-1}}{z - a} + \frac{a_{-2}}{(z - a)^2} + \cdots + \frac{a_{-m}}{(z - a)^m}. \tag{12-3}$$
2. The expansion (12-2) contains infinitely many terms with negative
powers of $z - a$.
The type of singularity at $z = a$ characterized by the representation
(12-3) is called a pole of order $m$. A pole of order 1 is also called a simple
pole. When the expansion (12-2) has infinitely many terms with negative
powers of $z - a$, the point $z = a$ is called an essential singular point of $f(z)$.
We shall see in Sec. 14 that the behavior of a function in the neighborhood
of a pole differs radically from that at an essential singular point.
We note from (12-3) that whenever $f(z)$ has a pole of order $m$, one can
define a function¹
$$\varphi(z) = (z - a)^m f(z), \quad z \ne a, \qquad \varphi(a) = a_{-m},$$
which is analytic at $z = a$, but the function $(z - a)^{m-1} f(z)$ is not analytic
at $z = a$. This property is used sometimes to define a pole of order $m$.
¹ When $z = a$, the function $\varphi(z)$ assumes the indeterminate form 0/0. We agree to
define $\varphi(a) = \lim_{z \to a} \varphi(z)$.
The coefficient $a_{-1}$ in the Laurent representation (12-2) of $f(z)$ in the
neighborhood of an isolated singular point $z = a$ plays an important role
in the evaluation of integrals of analytic functions. This coefficient is
called the residue of $f(z)$ at $z = a$.
When the singularity at $z = a$ is a pole of order $m$, the residue at $a$ can
be determined without deducing the Laurent expansion. Thus, on multiplying
(12-3) by $(z - a)^m$, we get
$$\varphi(z) = (z - a)^m f(z) = a_{-m} + a_{-m+1}(z - a) + \cdots + a_{-1}(z - a)^{m-1} + a_0(z - a)^m + \cdots, \tag{12-4}$$
where $a_{-m} \ne 0$. Since this is a power-series representation of $\varphi(z)$, the
coefficient $a_{-1}$ in it must be the coefficient of the term $(z - a)^{m-1}$ in the
Taylor expansion of $\varphi(z)$ about $z = a$. Thus
$$a_{-1} = \frac{1}{(m - 1)!}\,\frac{d^{m-1}}{dz^{m-1}}\left[(z - a)^m f(z)\right]_{z=a}. \tag{12-5}$$
We formulate this result as a useful theorem:
Theorem. If $\varphi(z) = (z - a)^m f(z)$ is analytic at $z = a$ and $\varphi(a) \ne 0$,
then $f(z)$ has a pole of order $m$ at $z = a$ with the residue given by (12-5).
As a special case of this theorem we note that when the pole at $z = a$
is simple, the residue at $a$ is given by the formula
$$a_{-1} = \lim_{z \to a} f(z)(z - a). \tag{12-6}$$
Example 1. Obtain the residues at the singular points of $f(z) = (1 + z)/[z(2 - z)]$.
This function has a simple pole at $z = 0$ inasmuch as
$$\varphi(z) = z\,\frac{1 + z}{z(2 - z)} = \frac{1 + z}{2 - z}$$
is analytic and does not vanish at $z = 0$. Also
$$\varphi(z) = (2 - z)\,\frac{1 + z}{z(2 - z)} = \frac{1 + z}{z}$$
is analytic at $z = 2$ and does not vanish for $z = 2$. Hence $f(z)$ also has a simple pole
at $z = 2$.
The residues at these points can therefore be computed with the aid of the formula
(12-6). We find that the residue at $z = 0$ is $\frac{1}{2}$ and at $z = 2$ it is $-\frac{3}{2}$.
Example 2. The function
$$f(z) = \frac{e^z}{z^2 + 1} = \frac{e^z}{(z + i)(z - i)}$$
obviously has simple poles at $z = -i$ and $z = i$. Therefore the residue at $z = i$ is
$$a_{-1} = \lim_{z \to i}\,(z - i)\frac{e^z}{(z + i)(z - i)} = \lim_{z \to i}\frac{e^z}{z + i} = \frac{e^i}{2i}.$$
Similarly, the residue at $z = -i$ is found to be $-e^{-i}/2i$.
Example 3. The function $f(z) = 1/[z(z + 1)^2]$ has a simple pole at $z = 0$, since
$$\varphi(z) = z\,\frac{1}{z(z + 1)^2} = \frac{1}{(z + 1)^2}$$
is analytic at $z = 0$ and $\varphi(0) \ne 0$. Therefore, the residue at $z = 0$ is
$$a_{-1} = \lim_{z \to 0}\frac{1}{(z + 1)^2} = 1.$$
The singularity of $f(z)$ at $z = -1$ is a pole of order 2, since
$$\varphi(z) = (z + 1)^2\,\frac{1}{z(1 + z)^2} = \frac{1}{z}$$
is analytic at $z = -1$ and $\varphi(-1) = -1$. We can therefore compute
the residue at $z = -1$ with the aid of (12-5). We get
$$a_{-1} = \frac{1}{1!}\,\frac{d}{dz}\left(\frac{1}{z}\right)_{z=-1} = -1.$$
the residue at
Example 4. The function (sin r)/* 4 has a pole of order 3 at z * 0 as the reader can
easily check with the aid of the theorem of this section. Hence the residue at z » 0
can be computed by using formula (12-5). It is simpler, however, in this case, to writ©
out the Laurent expansion in the neighborhood of t - 0 and obtain the residue from it.
Since sin t « z — («*/31) -{- (*V5!) ,
z 4 ■"*» 31* 5!
for |*| > 0.
It is dear from this that the singularity at * - 0 is a pole of order 3 with the residue
-1/31.
Example 5. The function
$$f(z) = \cos\frac{1}{z - 1}$$
has an isolated singular point at $z = 1$. This point, however, is not a pole, for on noting
that
$$\cos u = 1 - \frac{u^2}{2!} + \frac{u^4}{4!} - \cdots$$
we conclude by the substitution $u = 1/(z - 1)$ that for $|z - 1| > 0$,
$$\cos\frac{1}{z - 1} = 1 - \frac{1}{2!\,(z - 1)^2} + \frac{1}{4!\,(z - 1)^4} - \cdots.$$
This is the desired Laurent expansion about $z = 1$. Since it has infinitely many negative
powers of $z - 1$, the point $z = 1$ is an essential singular point. Inasmuch as the term
$(z - 1)^{-1}$ does not appear in the expansion, the residue $a_{-1}$ at $z = 1$ is zero.
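The procedure of Examples 4 and 5, reading the residue off a Laurent expansion built from a known Taylor series, can be mirrored with exact rational coefficients. In the Python sketch below a series is stored as a dictionary from power to coefficient; this representation and all function names are assumptions made for the illustration, and only finitely many terms are kept.

```python
from fractions import Fraction
from math import factorial

def sin_series(terms):
    """Exact Taylor coefficients of sin z about 0 as {power: coefficient}."""
    return {2 * k + 1: Fraction((-1) ** k, factorial(2 * k + 1)) for k in range(terms)}

def cos_series(terms):
    """Exact Taylor coefficients of cos u about 0 as {power: coefficient}."""
    return {2 * k: Fraction((-1) ** k, factorial(2 * k)) for k in range(terms)}

def shift_powers(series, shift):
    """Multiply a series by (z - a)^shift."""
    return {power + shift: c for power, c in series.items()}

def negate_powers(series):
    """Substitute u = 1/(z - a) into a series in u."""
    return {-power: c for power, c in series.items()}

def residue(series):
    """The residue is the coefficient of the power -1."""
    return series.get(-1, Fraction(0))

example4 = shift_powers(sin_series(6), -4)   # (sin z)/z^4 about z = 0
example5 = negate_powers(cos_series(6))      # cos(1/(z - 1)) about z = 1

print(min(example4), residue(example4))      # lowest power -3, residue -1/6
print(residue(example5))                     # residue 0
```

The lowest power present in `example4` is $-3$, which reflects the pole of order 3, while `example5` would acquire ever more negative powers as more terms are kept, reflecting the essential singularity.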
PROBLEMS
1. Obtain the Laurent expansions in the neighborhood of the singular points of the
following functions, and thus obtain the residues:
(«) , (JO «->'•*, («) j-z - t . C<0 - <•> 9 * </> A 1 ", ( b )
W j-Lj . » (W 004 *. w *w *•
1 -«*•
**
,w
«*
(*-!)*’
2. Whenever possible, determine the residues at the poles of the functions in Prob. 1
by means of formula (12-5).
3. Obtain the residues in Examples 1, 2, and 3 of this section by deducing appropriate
Laurent's series.
4. Prove the following theorem: If $f(z) = g(z)/h(z)$ is the quotient of two functions
analytic at $z = a$ such that $g(a) \ne 0$, $h(a) = 0$, and $h'(a) \ne 0$, then $f(z)$ has a simple
pole at $z = a$ with the residue $g(a)/h'(a)$. Hint: Examine the quotient of the Taylor
expansions of $g(z)$ and $h(z)$ about $z = a$.
5. Use the theorem of Prob. 4 to show that $f(z) = \cot z = \cos z/\sin z$ has simple
poles at $z = \pm k\pi$, $k = 0, 1, 2, \ldots$.
6. Note that $f(z) = 1/(2 - z) + 1/(z - 1)$ has the Laurent expansion
$$f(z) = \sum_{n=0}^{\infty} \frac{z^n}{2^{n+1}} + \sum_{n=1}^{\infty} \frac{1}{z^n}$$
valid in the ring $1 < |z| < 2$. This expansion has the term $1/z$. Does it follow that
$z = 0$ is a singular point of $f(z)$ with the residue equal to 1?
13. Residue Theorem. Let $f(z)$ be analytic in the given closed region
$R$ bounded by $C$, except at the isolated singular points $z = z_1$, $z = z_2$, $\ldots$,
$z = z_m$. If these points $z_k$ are enclosed by circles $\Gamma_k$ ($k = 1, 2, \ldots, m$),
so that $f(z)$ is analytic in the multiply connected region bounded by $C$ and
the $\Gamma_k$, we know that
$$\oint_C f(z)\,dz = \oint_{\Gamma_1} f(z)\,dz + \oint_{\Gamma_2} f(z)\,dz + \cdots + \oint_{\Gamma_m} f(z)\,dz. \tag{13-1}$$
But from (11-7), on setting $n = -1$, we see that
$$(a_{-1})_k = \frac{1}{2\pi i}\oint_{\Gamma_k} f(z)\,dz, \tag{13-2}$$
where $(a_{-1})_k$ is the residue of $f(z)$ at $z = z_k$. We can thus write (13-1)
in the form
$$\oint_C f(z)\,dz = 2\pi i\sum_{k=1}^{m} (a_{-1})_k. \tag{13-3}$$
The result embodied in this formula is known as the Residue Theorem:
The integral of $f(z)$ over a contour $C$ containing within it only isolated singular
points of $f(z)$ is equal to $2\pi i$ times the sum of the residues at these points.
Inasmuch as the residues of $f(z)$, as demonstrated in the preceding section,
can often be easily calculated, we see that formula (13-3) provides a simple
means for evaluating integrals of analytic functions with isolated singularities.
Example 1. Evaluate
$$\int_C \frac{1 + z}{z(2 - z)}\,dz,$$
where $C$ is the circle $|z| = 1$.
The only singular point of the integrand enclosed by $C$ is $z = 0$. In Example 1 of
Sec. 12 we saw that the residue of the integrand at $z = 0$ is $\frac{1}{2}$. Hence the value of the
integral is $(2\pi i)\frac{1}{2} = \pi i$. The value of this integral over any path $C$ enclosing $z = 0$
and $z = 2$ is $2\pi i(\frac{1}{2} - \frac{3}{2}) = -2\pi i$, since the residues at these points are $\frac{1}{2}$ and $-\frac{3}{2}$.
Example 2. Evaluate $\int_C \frac{e^z}{z^2 + 1}\,dz$ over the circular path $|z| = 2$.
The residues of the integrand at $z = i$ and $z = -i$ were computed in Example 2,
Sec. 12. Hence the value of the integral is
$$2\pi i\left(\frac{e^i}{2i} - \frac{e^{-i}}{2i}\right) = 2\pi i \sin 1.$$
Example 3. Evaluate $\int_C \cos\frac{1}{z - 1}\,dz$.
We saw in Example 5 of Sec. 12 that $z = 1$ is an essential singular point with the
residue zero. Hence the value of the integral is zero for every closed path $C$ which
does not pass through $z = 1$. If $z = 1$ lies on $C$, the integral is improper and other
means have to be employed to determine its value.
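The residue theorem can be tested on Example 2 by comparing a direct numerical integration over $|z| = 2$ with $2\pi i$ times the sum of the residues found in Sec. 12. The Python sketch below is an illustration only; `contour_integral` and the sample count are assumed names and values.

```python
import cmath
import math

def contour_integral(f, radius, n=2000):
    """Trapezoidal approximation of the integral of f(z) dz over |z| = radius, counterclockwise."""
    total = 0j
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        z = radius * cmath.exp(1j * k * dtheta)
        total += f(z) * 1j * z * dtheta        # dz = i * z * d(theta) on a circle about 0
    return total

f = lambda z: cmath.exp(z) / (z * z + 1)

# Residues from formula (12-6): e^i/(2i) at z = i and -e^{-i}/(2i) at z = -i.
residues = [cmath.exp(1j) / 2j, -cmath.exp(-1j) / 2j]

direct = contour_integral(f, 2.0)
by_residues = 2j * math.pi * sum(residues)
print(direct, by_residues, 2j * math.pi * math.sin(1.0))
```

All three printed values should agree to within rounding, since the sum of the two residues equals $\sin 1$.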
PROBLEMS
1. Use results of Prob. 1, Sec. 12, to obtain values of the following integrals where C
is the circle |*I *» 2:
« «’ Lrh- «*> //“'-• « L }=r+ <•>
2. Determine the residues of $f(z) = \frac{z - 2}{z(z - 1)}$ at $z = 0$ and $z = 1$, and thus evaluate
$\oint_C \frac{z - 2}{z(z - 1)}\,dz$, where $C$ is the circle $|z| = 2$.
3. Evaluate the integrals $\oint_{C_i} \frac{z + 1}{z^2 - 2z}\,dz$ $(i = 1, 2)$, where $C_1$ is the circle $|z| = 1$ and
$C_2$ is the circle $|z| = 3$.
4. Find the value of /
Jc
circle 1*1 - 3.
i + l
F C (* — 2)*
dz, where (a) C is the circle |z| * 1, (6) C is the
14. Behavior of f(z) at Poles and Essential Singular Points. From
Laurent's representation (12-3) of $f(z)$ in the neighborhood of a pole $z = a$,
we easily conclude that $|f(z)|$ becomes infinite as $z \to a$. The behavior
of $|f(z)|$ with an essential singularity at $z = a$ is different because the
expansion (12-2) has infinitely many terms with negative powers of $z - a$.
While it is true that in this case $|f(z)|$ is also unbounded as $z \to a$, the
function $|f(z)|$ oscillates as $z \to a$. Indeed, it was shown by E. Picard
that in the neighborhood of an essential singular point, $f(z)$ assumes any
preassigned value, with the possible exception of one value, infinitely many
times. A discussion of this would carry us too far in the study of analytic
functions, and we merely illustrate this behavior by an example. Since
$e^{1/z} = 1 + \frac{1}{z} + \frac{1}{2!\,z^2} + \cdots, \qquad |z| > 0,$
$f(z) = e^{1/z}$ has an essential singular point at $z = 0$. We show that if $A$
is any complex number not zero, there are infinitely many values of $z$
in the neighborhood of $z = 0$ such that
$e^{1/z} = A,$ (14-1)
for on taking the logarithm of (14-1) we get infinitely many solutions
$\frac{1}{z} = \operatorname{Log} |A| + i(\phi + 2k\pi), \qquad k = 0, \pm 1, \pm 2, \ldots,$
where $\phi$ is the principal argument of $A$.
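The solutions of $e^{1/z} = A$ can be tabulated directly from the logarithm formula. The sketch below is an illustration, not part of the original text; the function name `solutions_of_exp_inverse` and the sample value of $A$ are assumptions made for the example. It shows that the solutions $z_k$ crowd toward $z = 0$ as $k$ grows, which is the behavior described by Picard's theorem.

```python
import cmath
import math


def solutions_of_exp_inverse(A, ks):
    """Return the points z_k with exp(1/z_k) = A, using
    1/z_k = Log|A| + i(phi + 2 k pi), phi the principal argument of A.
    A must be nonzero; for A = 1 the index k = 0 must be skipped (1/z = 0)."""
    phi = cmath.phase(A)              # principal argument, in (-pi, pi]
    log_mod = math.log(abs(A))
    return [1 / complex(log_mod, phi + 2 * math.pi * k) for k in ks]


if __name__ == "__main__":
    A = 2 - 3j                        # arbitrary nonzero sample value
    for k, z in zip(range(1, 6), solutions_of_exp_inverse(A, range(1, 6))):
        print(k, z, abs(z), cmath.exp(1 / z))
```

Each printed value of `cmath.exp(1 / z)` reproduces $A$ up to rounding, while `abs(z)` decreases with $k$.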
GEOMETRIC ASPECTS
15. Geometric Representation. The usefulness of graphical representa-
tion of real-valued functional relationships in the familiar three-dimensional
space is too obvious to require emphasis. The customary mode of rep-
resenting real functions by curves and surfaces fails, however, when one
encounters functions of more than two independent variables. Thus, a
relationship $u = f(x,y,z)$ containing three independent real variables $x$, $y$, $z$
requires a four-dimensional space for geometric representation. Similar
difficulties arise when one attempts to represent graphically complex
functions $w = f(z)$, with $z = x + iy$. For, to each pair of values $(x,y)$,
there correspond two values $(u,v)$ in $w = u + iv$, and in order to plot a
quadruplet of real values $(u,v,x,y)$ we need a four-dimensional space.
However, a different mode of visualizing the relationship $w = f(z)$ which
utilizes two separate complex planes for the representation of $z$ and $w$ is
possible. The relationship $w = f(z)$ then establishes a connection between
the points of a given region $R$ in the $z$ plane and another region $R'$
determined by $w = f(z)$ in the $w$ plane.
On separating w = f(z) into real and imaginary parts one obtains two
real functions
$u = u(x,y), \qquad v = v(x,y),$ (15-1)
which can be viewed as the equations of a transformation that maps a
specified set of points in the xy plane into another set of points (u,v) in
the uv plane.
We turn now to this mode of studying complex functions.
Example 1. Let $w = z + a$, where $a = h + ik$ is a complex constant.
We set $w = u + iv$, $z = x + iy$, and get
$u + iv = x + iy + h + ik = (x + h) + i(y + k).$
Hence
$u = x + h, \qquad v = y + k.$ (15-2)
Formulas (15-2) are the familiar equations defining a translation, and the relationship
$w = z + a$ can be visualized as representing a rigid displacement of points in the $z$
plane, where each point is moved $h$ units in the direction of the $x$ axis and $k$ units in the
direction of the $y$ axis.
Example 2. To study the function $w = az$, where $a$ is a constant, it is convenient to
use polar coordinates.
We set $z = re^{i\theta}$, $w = \rho e^{i\phi}$, $a = Ae^{i\alpha}$ and get
$\rho e^{i\phi} = Ar e^{i(\alpha + \theta)}.$
Hence
$\rho = Ar, \qquad \phi = \alpha + \theta.$ (15-3)
We see from (15-3) that the modulus of $w$ is obtained by multiplying the modulus of $z$ by $A$.
Also the argument $\phi$ of $w$ is obtained by adding a constant angle $\alpha$ to the argument $\theta$ of $z$.
We can visualize the transformation (15-3) as representing a stretching in the ratio $A : 1$
accompanied by a rotation through an angle $\alpha$. A square with the center at the origin
in the $z$ plane is thus deformed into a square, a circle of radius $R$ is transformed into a
circle of radius $AR$, and more generally any figure is transformed into a similar figure
enlarged by the factor $A$. If $A = 1$, we have a pure rotation through an angle $\alpha$.
The same conclusions can be reached (but less readily) by setting $w = u + iv$, $z = x + iy$,
$a = a_1 + ia_2$ and by deducing from $w = az$ the transformation
$u = a_1 x - a_2 y, \qquad v = a_2 x + a_1 y,$
in cartesian coordinates.
Example 3. To study the relationship $w = 1/z$, $z \neq 0$, we again use polar coordinates.
On setting $w = \rho e^{i\phi}$, $z = re^{i\theta}$, we get $\rho e^{i\phi} = (1/r)e^{-i\theta}$, so that
$\rho = \frac{1}{r}, \qquad \phi = -\theta.$ (15-4)
It is clear from (15-4) that the unit circle $|z| = 1$ is transformed into the unit circle
$|w| = 1$ in the $w$ plane. Since $\phi = -\theta$, the corresponding points on these circles are obtained
by reflection in the axis of reals (Fig. 17). As the point $A$ traces out the circle $|z| = 1$ in
the clockwise direction, the corresponding point $A'$ in the $w$ plane traces out the circle
$|w| = 1$ in the counterclockwise direction. Points in the interior of $|z| = 1$ are mapped
into points in the exterior of $|w| = 1$, except that the transformation of the point
$z = 0$ is not defined by $w = 1/z$. Points in the neighborhood of $z = 0$ map into points
at a great distance from the origin of the $w$ plane, since $\rho = 1/r$. To complete the
correspondence of points, we can introduce a new point $w = \infty$ as the correspondent of
$z = 0$. The point $w = \infty$ is called the point at infinity. If we consider the inverse
transformation $z = 1/w$, we see that $w = 0$ corresponds to $z = \infty$.
Fig. 17
The reader can show that the equations of transformation defined by $w = 1/z$ in
cartesian coordinates have the form
$u = \frac{x}{x^2 + y^2}, \qquad v = \frac{-y}{x^2 + y^2},$
with the inverse
$x = \frac{u}{u^2 + v^2}, \qquad y = \frac{-v}{u^2 + v^2}.$ (15-5)
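The cartesian form of the inversion $w = 1/z$, namely $u = x/(x^2 + y^2)$, $v = -y/(x^2 + y^2)$, is easy to check against complex arithmetic. The following sketch is illustrative only; the function name `invert` is an assumption and not from the text. Applying the map twice returns the starting point, which reflects the fact that the inverse transformation has the same form.

```python
def invert(x, y):
    """Cartesian form of w = 1/z for z = x + iy != 0: returns (u, v)."""
    d = x * x + y * y
    return x / d, -y / d


if __name__ == "__main__":
    x, y = 3.0, -4.0
    u, v = invert(x, y)
    print((u, v), 1 / complex(x, y))   # the two results should agree
    print(invert(u, v))                # applying the map again recovers (x, y)
```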
PROBLEMS
1. Discuss the transformations defined by (a) w * (1 -f i)z f (b) w « l/(z — 1),
( c ) w * 1 / 2 , (d) w «* 02 4* 6.
2. Show that every circle in the $z$ plane maps by the transformation $w = 1/z$ into a
circle in the $w$ plane if one considers straight lines as the limiting cases of circles. Hint:
Write the general equation of the circle in cartesian coordinates, and make use of (15-5).
3. Show that the bilinear transformation
$w = \frac{az + b}{cz + d}, \qquad \text{with } ad - bc \neq 0,$
can be decomposed into successive transformations $z' = cz + d$, $z'' = 1/z'$,
$w = (a/c) + [(bc - ad)/c]z''$, which are of the type studied in Examples 1, 2, and 3. Then conclude
(see Prob. 2) that a bilinear transformation transforms circles into circles. Discuss the
case when $ad - bc = 0$.
16. Functions $w = z^n$ and $z = \sqrt[n]{w}$. Let us study next the mapping
determined by the function
$w = z^2.$ (16-1)
If we set $z = re^{i\theta}$ and $w = \rho e^{i\phi}$, we get
$\rho e^{i\phi} = r^2 e^{2i\theta},$
so that
$\rho = r^2, \qquad \phi = 2\theta.$ (16-2)
It is clear from (16-2) that the upper half of the $z$ plane maps into the whole
$w$ plane, for when $z$ is in the upper half plane, the range of variation of $\theta$
is $0 \le \theta \le \pi$. Since $\phi = 2\theta$, we see that the arguments of the corresponding
points in the $w$ plane vary from 0 to $2\pi$. Points on the upper half of the
circle $|z| = r$ map into the entire circle $|w| = r^2$ (Fig. 18). The half
ray $OA$ in the $z$ plane maps into the half ray $O'A'$ in the $w$ plane. A
radial line $OB$, making an angle $\theta$ with the $x$ axis, goes over into a radial
line $O'B'$, making an angle $\phi = 2\theta$ with the $u$ axis. The interior of the
quadrant $OAC$ of the circle $|z| = 1$ maps into the interior of the semicircle
$|w| = 1$ in the upper half of the $w$ plane with the boundary $ABC$ going
over into the boundary $A'B'C'$. The segment $OF$ of the negative real
axis in the $z$ plane maps into the segment $O'F'$ along the positive $u$ axis.
To distinguish points on the positive $u$ axis that correspond to points on
the ray $OA$ from those on $OF$, we can imagine that the $w$ plane is slit
Fig. 18
along the positive $u$ axis and suppose that the points corresponding to $OA$
lie on the upper bank of the slit $O'A'$ and that those corresponding to $OF$
lie on the lower bank $O'F'$.
The transformation of points determined by (16-1) can be visualized as
a fanwise stretching of the upper half of the $z$ plane in which the sector
$OAB$ opens into a sector $O'A'B'$ and the half circle $OACF$ is deformed into
the whole circle $|w| = 1$. The semicircles of radius $r$ in the $z$ plane go
over into full circles of radius $\rho = r^2$ in the $w$ plane. Points in the lower
half of the circle $|z| = 1$ map into the whole circle $|w| = 1$, inasmuch as
the replacement of $\theta$ by $\theta + \pi$ in (16-2) yields $\phi = 2\theta + 2\pi$. Thus, two
distinct points $B$ and $G$ with the arguments $\theta$ and $\theta + \pi$ in the $z$ plane
correspond to one and the same point $B'$ in the $w$ plane.
This is to be expected, since, on solving (16-1) for $z$, we get
$z = \sqrt{w},$ (16-3)
which is a double-valued function. If we set $w = \rho e^{i\phi}$ in (16-3), we get
two values
$z = \sqrt{\rho}\, e^{i(\phi/2)}, \qquad z = \sqrt{\rho}\, e^{i(\phi/2 + \pi)} = -\sqrt{\rho}\, e^{i(\phi/2)}.$ (16-4)
For points along the $u$ axis, the argument $\phi = 0$. Points on the upper
bank of the slit $O'A'$ in Fig. 18 correspond to $z = \sqrt{\rho}$, and those of the
lower bank $O'F'$ to $z = -\sqrt{\rho}$. Thus, along the slit, $z = \sqrt{w}$ is a
discontinuous function unless $\rho = 0$.
The function
$w = z^n, \qquad n \text{ a positive integer},$ (16-5)
can be studied in the same way. On setting $z = re^{i\theta}$, $w = \rho e^{i\phi}$, we find
$\rho = r^n, \qquad \phi = n\theta.$ (16-6)
This time a wedge of angle $2\pi/n$ in the $z$ plane (Fig. 19) maps into the
whole of the $w$ plane, and a circular arc $ACB$ of radius $R$ goes over into
a full circle $|w| = R^n$. An adjacent wedge $OBD$ of angle $2\pi/n$ also maps
into the whole $w$ plane. If we divide the $z$ plane into a set of $n$ adjoining
wedges, each of angle $2\pi/n$, the entire $z$ plane will be mapped into the $w$
plane $n$ times.
Corresponding to a given point $w \neq 0$, there will be $n$ values of $z$
determined by the $n$ roots
$z = \rho^{1/n}\left[\cos\left(\frac{\phi}{n} + \frac{2\pi k}{n}\right) + i \sin\left(\frac{\phi}{n} + \frac{2\pi k}{n}\right)\right]$ (16-7)
with $k = 0, 1, \ldots, n - 1$. Each of these roots lies in one of the wedges
into which the $z$ plane is divided.
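The root formula above translates directly into a few lines of code. The sketch below is an illustration rather than part of the original text; the function name `nth_roots` is an assumption. It uses the principal argument of $w$ for $\phi$, so the wedge boundaries it implies start at $\phi/n$ with $-\pi < \phi \le \pi$.

```python
import cmath
import math


def nth_roots(w, n):
    """Return the n values of z with z**n = w (w != 0), following
    z = rho**(1/n) * exp(i (phi + 2 pi k) / n), k = 0, 1, ..., n - 1."""
    rho, phi = cmath.polar(w)         # phi is the principal argument of w
    return [cmath.rect(rho ** (1.0 / n), (phi + 2 * math.pi * k) / n)
            for k in range(n)]


if __name__ == "__main__":
    for z in nth_roots(-8 + 0j, 3):
        print(z, z ** 3)              # each cube should reproduce -8
```

Successive roots differ in argument by $2\pi/n$, so exactly one of them falls in each wedge of angle $2\pi/n$.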
Some further insight into the character of mapping by means of (16-1)
can be gained by studying the maps of lines $u = \text{const}$, $v = \text{const}$. If
we set $z = x + iy$ in (16-1), we find
$u = x^2 - y^2, \qquad v = 2xy,$ (16-8)
so that the lines $u = \text{const}$, $v = \text{const}$ map into orthogonal hyperbolas
$x^2 - y^2 = \text{const}$, $2xy = \text{const}$. Some of these are shown in Fig. 20, in
which the corresponding points are labeled by like letters.
Fig. 20
17. The Functions $w = e^z$ and $z = \log w$. If we set $w = u + iv$ and
$z = x + iy$ in
$w = e^z,$ (17-1)
we get
$u + iv = e^{x + iy} = e^x(\cos y + i \sin y).$
Hence
$u = e^x \cos y, \qquad v = e^x \sin y.$ (17-2)
It follows from these equations that
$u^2 + v^2 = e^{2x}, \qquad \frac{v}{u} = \tan y.$ (17-3)
Accordingly, the lines $x = \text{const}$ map into the circles $u^2 + v^2 = \text{const}$ in
the $w$ plane, and the lines $y = \text{const}$ map into the radial lines $v/u = \text{const}$.
Since
$e^{z + 2k\pi i} = e^z e^{2k\pi i} = e^z, \qquad k = 0, \pm 1, \pm 2, \ldots,$ (17-4)
we see that $w = e^z$ has an imaginary period¹ $2\pi i$. Hence, if the $z$ plane is
divided into horizontal strips of width $2\pi$, with the initial strip determined
by $0 \le y < 2\pi$ (Fig. 21), the relations (17-4) ensure that the behavior
of $w = e^z$ in every strip $2k\pi \le y < 2(k + 1)\pi$, $k = \pm 1, \pm 2, \ldots$, is
identical with that in the initial strip. Consequently, we can confine our
attention to the behavior of $w = e^z$ in the initial strip $0 \le y < 2\pi$.
A segment $AC$ of a straight line $x = x_0$ in the initial strip maps by (17-3)
into a circle $u^2 + v^2 = e^{2x_0}$. The points $A(x_0, 0)$, $C(x_0, 2\pi)$ correspond to
the same point $u = e^{x_0}$, $v = 0$ on the $u$ axis. The segment $OP$ of the
$y$ axis maps into the unit circle $u^2 + v^2 = 1$, since along $OP$ $x = 0$; the
half strip $x > 0$, $0 \le y < 2\pi$, maps into the region $|w| > 1$. If $x < 0$,
a segment such as $QR$ in Fig. 21 maps into a circle whose radius is less than
1. The half strip $x < 0$, $0 \le y < 2\pi$, goes into the interior of the circle
$|w| = 1$. Points on the lines $y = 0$, $y = 2\pi$, forming the boundaries of the
strip, map into points on the positive $u$ axis. If we slit the $w$ plane along
the positive $u$ axis, then the points on the upper bank of the slit correspond
to points on the line $y = 0$ and those on the lower bank to points on $y = 2\pi$.
The interior of the rectangle $OACP$ in Fig. 21 corresponds to the interior
of the ring between the circles $u^2 + v^2 = 1$ and $u^2 + v^2 = e^{2x_0}$.
We further note that a point moving along the $x$ axis away from the
origin $O$ in the positive direction has for its image a point in the $w$ plane
that moves in the positive direction along the $u$ axis away from the image
$O'$ on the unit circle. A point moving away from $O$ in the direction of the
negative $x$ axis has for its image a point moving from $O'$ toward the origin
of the $w$ plane.
¹ As for real functions, we say that $f(z)$ is periodic of period $a$ if $f(z + a) = f(z)$.
If we consider some definite point $w_0$ in the $w$ plane, the equation
$w_0 = e^z$ (17-5)
has for its solution
$z = \log w_0 = \operatorname{Log} |w_0| + i(\phi + 2k\pi), \qquad k = 0, \pm 1, \pm 2, \ldots,$ (17-6)
where $\phi$ is the principal argument of $w_0$. All these values of $z$ differ only
by the imaginary part, and therefore there is just one solution of (17-5)
in each strip $2k\pi \le y < 2(k + 1)\pi$. The function
$z = \log w$
is therefore infinitely-many-valued. If we restrict our attention to the
slit $w$ plane so that the argument $\phi$ of $w$ lies between 0 and $2\pi$, the mapping
from the $w$ plane to the $z$ plane will be single-valued with just one image of
$\log w$ in the fundamental strip $0 \le y < 2\pi$ of the $z$ plane.
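The statement that each horizontal strip of width $2\pi$ contains exactly one value of the logarithm can be illustrated as follows. This is a sketch under stated assumptions, not part of the original text: the function name `log_branch` is hypothetical, and the argument of $w$ is taken in $[0, 2\pi)$, as for the slit $w$ plane, rather than as the principal argument.

```python
import cmath
import math


def log_branch(w, k):
    """Value of log w whose imaginary part lies in 2 k pi <= y < 2 (k + 1) pi.
    The argument of w is measured in [0, 2 pi), matching the slit w plane."""
    phi = cmath.phase(w) % (2 * math.pi)
    return complex(math.log(abs(w)), phi + 2 * math.pi * k)


if __name__ == "__main__":
    w = -1 - 1j
    for k in (-1, 0, 1):
        z = log_branch(w, k)
        print(k, z, cmath.exp(z))     # exp of every branch value returns w
```

All branch values share the real part $\operatorname{Log}|w|$ and differ by multiples of $2\pi i$, which is the periodicity of $e^z$ read in reverse.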
To study the map of $w = \log z$ we interchange the roles of the $z$ and
$w$ planes in the foregoing discussion. We remark in conclusion that inasmuch
as all trigonometric functions of $z$ are defined in terms of $e^z$, a study
of the mapping properties of such functions is reducible to the study of
mapping by $w = e^{az}$.
PROBLEMS
1. Discuss in detail mapping by the function w « **.
2. Show that the function
$w = \frac{a}{2}\left(z + \frac{1}{z}\right), \qquad a > 0,$
maps the circles $|z| = \text{const}$ into confocal ellipses and the radial lines $\arg z = \theta = \text{const}$
into confocal hyperbolas.
3. Prove that $\sin z$ and $\tan z$ are periodic functions.
4. Show that the curves $u(x,y) = \text{const}$, $v(x,y) = \text{const}$ in (16-8) intersect at right
angles (Fig. 20).
18. Conformal Maps. We noted in Sec. 15 that the relationship $w = f(z)$
can be viewed as a mapping that sets up a correspondence between the
points of the $z$ and $w$ planes. If $w = f(z)$ is analytic in some region $R$
of the $z$ plane, and if $C$ is a curve in $R$, there is a remarkable connection
between $C$ and its image $C'$ in the corresponding region $R'$ in the $w$ plane
(Fig. 22). Consider a pair of points $z$ and $z + \Delta z$ on $C$, and let the arc
length between them be $\Delta s = PQ$. The corresponding points in the
region $R'$ are denoted by $w$ and $w + \Delta w$, and the arc length between them
by $\Delta s' = P'Q'$. Since the ratio of the arc lengths has the same limit as
the ratio of the lengths of the corresponding chords,
$\lim_{\Delta s \to 0} \frac{\Delta s'}{\Delta s} = \lim_{\Delta z \to 0} \frac{|\Delta w|}{|\Delta z|} = \lim_{\Delta z \to 0} \left|\frac{\Delta w}{\Delta z}\right| = \left|\frac{dw}{dz}\right|.$ (18-1)
We shall exclude from consideration those points of $R$ at which $dw/dz = 0$
because at such points the correspondence of values of $z$ and $w$ ceases to
be one to one.¹
Formula (18-1) shows that an element of arc through P, on being trans-
formed to the w plane, suffers a change in length such that the magnification
ratio is equal to the modulus of dw/dz at P. This ratio is the same for all
curves passing through P, but ordinarily it varies from point to point in the
z plane, since \dw/dz\ need not have the same value at all points of the
z plane.
¹ If $dw/dz = f'(z) = 0$ at some point $P$ of $R$, then $dz/dw = 1/f'(z)$ is not defined at the
corresponding point $P'$ for the inverse function $z = F(w)$. Thus $F(w)$ is not analytic at
$P'$. Indeed, it can be shown that a necessary and sufficient condition for the existence
of a unique differentiable solution of $w = f(z)$ at the point $z = z_0$ is precisely $f'(z_0) \neq 0$.
We shall see next that the argument of $dw/dz$ determines the orientation
of the element of arc $\Delta s'$ relative to $\Delta s$. The argument $\theta$ of $\Delta z$ (Fig. 22)
is the angle made by the chord $PQ$ with the positive direction of the $x$
axis, while the argument $\theta'$ of $\Delta w$ is the angle made by the corresponding
chord $P'Q'$ with the $u$ axis.
Hence, the difference between the angles $\theta'$ and $\theta$ is equal to
$\arg \Delta w - \arg \Delta z = \arg \frac{\Delta w}{\Delta z},$
since the difference of the arguments of two complex numbers is equal to
the argument of their quotient. As $\Delta z \to 0$, the vectors $\Delta z$ and $\Delta w$ tend
to coincide with the tangents to $C$ at $P$ and $C'$ at $P'$, respectively, and
hence $\arg dw/dz$ is the angle of rotation of the element of arc $\Delta s'$ relative
to $\Delta s$. It follows immediately from this statement that if $C_1$ and $C_2$ are
two curves which intersect at $P$ at an angle $\tau$ (Fig. 23), then the corresponding
curves $C_1'$ and $C_2'$ in the $w$ plane also intersect at an angle $\tau$, for the
tangents to these curves are rotated through the same angle.
A transformation that preserves angles is called conformal, and thus one
can state the following theorem:
Theorem. The mapping performed by an analytic function $f(z)$ is conformal
at all points of the $z$ plane where $f'(z) \neq 0$.
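The angle-preserving property, and its failure where the derivative vanishes, can be observed numerically. The sketch below is illustrative and not part of the original text: the function name `image_angle`, the step size `h`, and the choice $f(z) = z^2$ are assumptions. It measures the angle between the images of two short segments leaving a point $z_0$.

```python
import cmath
import math


def image_angle(f, z0, dir1, dir2, h=1e-6):
    """Angle (radians) between the images under f of two short segments
    that leave z0 in the directions dir1 and dir2 (angles with the x axis)."""
    def image_direction(theta):
        step = h * cmath.exp(1j * theta)
        return cmath.phase(f(z0 + step) - f(z0))
    diff = image_direction(dir2) - image_direction(dir1)
    return (diff + math.pi) % (2 * math.pi) - math.pi   # wrap to [-pi, pi)


if __name__ == "__main__":
    square = lambda z: z * z
    # Where f'(z0) != 0 the angle 0.8 between the segments is preserved.
    print(image_angle(square, 1 + 1j, 0.3, 1.1))
    # At z0 = 0, where f'(0) = 0, the same angle is doubled instead.
    print(image_angle(square, 0j, 0.3, 1.1))
```

The first value is close to 0.8 and the second close to 1.6, in line with the theorem's restriction to points where the derivative is not zero.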
The angle-preserving property of the transformation by analytic func-
tions has many important physical applications. We shall indicate several
of these in the remaining sections of this chapter, and we merely note here
that a number of results deducible analytically from Sec. 15, Chap. 5,
follow directly from geometric considerations.
For example, if an incompressible fluid with a velocity potential $\Phi(x,y)$
flows over a plane (so that $v_x = \partial\Phi/\partial x$, $v_y = \partial\Phi/\partial y$), then it is known¹
that the streamlines $\Psi(x,y) = \text{const}$ are directed at right angles to the
equipotential curves $\Phi(x,y) = \text{const}$.
¹ See Sec. 15, Chap. 5, and particularly Prob. 6 of that section.
The orthogonality of the curves $\Phi = \text{const}$ and $\Psi = \text{const}$ in the $z$ plane
follows at once from the conformal properties of transformations by
analytic functions. It was shown¹ that the functions $\Phi$ and $\Psi$ satisfy
the Cauchy-Riemann equations. One can therefore assert that $\Phi$ and $\Psi$
are the real and imaginary parts, respectively, of some analytic function
$w = f(z)$; that is,
$f(z) = \Phi(x,y) + i\Psi(x,y).$
But the curves $\Phi = \text{const}$ and $\Psi = \text{const}$ represent a net of orthogonal
lines (Fig. 24) parallel to the coordinate axes in the $w$ plane, and they are
transformed by the analytic function $w = \Phi(x,y) + i\Psi(x,y)$ into a net of
orthogonal curves in the $z$ plane.
We saw in Sec. 9 that the real and imaginary parts of every analytic
function $f(z) = u(x,y) + iv(x,y)$ are harmonic; that is, they satisfy
Laplace's equation in the region where $f(z)$ is analytic. Since solutions of
Laplace's equation are demanded in numerous practical problems, analytic
functions serve as a useful apparatus for producing such solutions. For
example, if we take
$w = u + iv = \sin z = \sin (x + iy),$
then
$u + iv = \sin x \cos iy + \cos x \sin iy = \sin x \cosh y + i \cos x \sinh y.$
The harmonic functions $u = \sin x \cosh y$, $v = \cos x \sinh y$ are of special
interest in deducing solutions of Laplace's equation in rectangular regions.²
¹ See Eq. (15-10), Sec. 15, Chap. 5.
² See, for example, Sec. 20.
Further importance of conformal transformation by analytic functions
derives from the fact that a harmonic function remains harmonic when
subjected to such a transformation. If a function $\phi(u,v)$ satisfies Laplace's
equation
$\frac{\partial^2 \phi}{\partial u^2} + \frac{\partial^2 \phi}{\partial v^2} = 0$ (18-2)
in some region $R'$ of the $uv$ plane, then $\phi$ still satisfies Laplace's equation,
in the appropriate region $R$ of the $xy$ plane, when the variables $u$, $v$ in
$\phi(u,v)$ are related to $x$, $y$ by an analytic function
$w = u + iv = f(z).$ (18-3)
To see this, construct an analytic function
$F(w) = \phi(u,v) + i\psi(u,v)$ (18-4)
by calculating the conjugate $\psi(u,v)$ of the harmonic function $\phi(u,v)$.
The substitution from (18-3) in (18-4) yields
$F[f(z)] = \Phi(x,y) + i\Psi(x,y),$ (18-5)
which is analytic in the region $R$ of the $xy$ plane into which the region
$R'$ is mapped by (18-3). The function $\Phi(x,y)$, being the real part of the
analytic function $F[f(z)]$, is harmonic.
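The invariance of harmonicity under an analytic change of variables can be tested with a finite-difference Laplacian. The following sketch is not part of the original text; the harmonic function $u^3 - 3uv^2$, the analytic map $w = e^z$, the non-analytic comparison map $u = x$, $v = 2y$, and all function names are assumptions chosen for illustration.

```python
import cmath


def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference approximation to g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / (h * h)


def phi(u, v):
    # Harmonic in (u, v): the real part of w**3.
    return u ** 3 - 3 * u * v ** 2


def transformed(x, y):
    # phi composed with the analytic map w = e^z, z = x + iy.
    w = cmath.exp(complex(x, y))
    return phi(w.real, w.imag)


def stretched(x, y):
    # phi composed with the non-analytic map u = x, v = 2y, for contrast.
    return phi(x, 2 * y)


if __name__ == "__main__":
    print(laplacian(transformed, 0.3, 0.4))   # close to zero
    print(laplacian(stretched, 0.3, 0.4))     # clearly nonzero
```

The first Laplacian vanishes up to discretization error, while the stretched version does not, which illustrates why the map in (18-3) is required to be analytic.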
This property of the transformation of harmonic functions by means of
analytic functions is of the utmost practical importance; for, suppose that
we are required to find a solution $\phi(u,v)$ of Laplace's equation (18-2) such
that on the boundary $C'$ of some complicated region $R'$ in the $uv$ plane,
$\phi(u,v)$ assumes specified values. If it should prove possible to find a function
$w = f(z)$ which maps the region $R'$ conformally into some simple
region $R$ (a circle, for example) in the $z$ plane, it may be relatively easy to
determine the transform $\Phi(x,y)$ of $\phi(u,v)$ in the region $R$ with proper values
of $\Phi$ on the boundary $C$.
If $\Phi(x,y)$ is so determined, the function $\phi(u,v)$ can be obtained by replacing
the variables in $\Phi(x,y)$ by their values in terms of $u$ and $v$. It is a
remarkable fact, first discovered by Riemann, that every simply connected
region $R'$ (with more than one boundary point) can be mapped conformally
onto the unit circle $|z| < 1$ in such a way that the boundary $C'$ corresponds
to the circular boundary $|z| = 1$.
We shall sketch this mode of solution of the Dirichlet problem in Sec. 21.
PROBLEMS
1. Obtain solutions of Laplace’s equation from (a) w « cos z, (b) w » e? f (c) w ■■
(d) w m log z, (e) w «* l/z.
2. Construct the conjugate harmonic functions $v(x,y)$ for the following functions:
(a) $u = \cos x \cosh y$; (b) $u = e^x \cos y$; (c) $u = y + e^x \cos y$; (d) $u = \cosh x \cos y$.
3. Examine the mapping by $w = z^2$ and $w = z^3$ at $z = 0$. Is it conformal at $z = 0$?
Examine the behavior of the maps of rays issuing from $z = 0$. What are the ratios of
magnification of the arc elements at $z = 1$, $z = 1 + i$, $z = i$?
APPLICATIONS
19. Steady Flow of Ideal Fluids. We discussed the flow of nonviscous
incompressible fluids in Sec. 15 of Chap. 5, where we introduced the concept
of the velocity potential $\Phi(x,y)$ and the stream function $\Psi(x,y)$.
These functions were shown to be related by the Cauchy-Riemann equations
$\frac{\partial \Phi}{\partial x} = \frac{\partial \Psi}{\partial y}, \qquad \frac{\partial \Phi}{\partial y} = -\frac{\partial \Psi}{\partial x}.$ (19-1)
It follows from (19-1) that
$F(z) = \Phi(x,y) + i\Psi(x,y)$ (19-2)
is an analytic function of a complex variable $z = x + iy$. We shall call
$F(z)$ the complex potential and show that its derivative is related simply to
the velocity vector $\mathbf{v} = \nabla\Phi$ of the fluid particles.
By (4-5),
$\frac{dF}{dz} = \frac{\partial \Phi}{\partial x} + i\frac{\partial \Psi}{\partial x}$ (19-3)
and, since $\mathbf{v} = \nabla\Phi$, so that
$v_x = \frac{\partial \Phi}{\partial x}, \qquad v_y = \frac{\partial \Phi}{\partial y} = -\frac{\partial \Psi}{\partial x},$
we can write (19-3) in the form
$\frac{dF}{dz} = v_x - iv_y.$ (19-4)
We shall see in Sec. 21 that because of the simplicity of the complex-variable
theory in comparison with the theory of real functions, it is often
simpler to calculate the complex potential $F(z)$ than it is to determine
either of the real functions $\Phi(x,y)$ or $\Psi(x,y)$. This determination depends
on certain so-called boundary conditions, which are now to be described.
We first recall¹ that since $\mathbf{v} = \nabla\Phi$ is orthogonal to the curves $\Phi(x,y) = \text{const}$
and these curves are orthogonal to the curves $\Psi(x,y) = \text{const}$, the vector
$\mathbf{v}$ is tangent to the curves $\Psi(x,y) = \text{const}$. Hence these curves, called
streamlines, are the paths of the fluid particles. When a sheet of fluid flows
past an impenetrable obstacle $C$ (cf. Fig. 25, Sec. 20), the fluid particles
must flow along the obstacle and hence the boundary $C$ must coincide with
one of the streamlines. Thus the equation of one of the streamlines, say
$\Psi(x,y) = k,$ (19-5)
must coincide with the equation of the boundary $C$.
¹ See Sec. 3, Chap. 5, and Sec. 18 of this chapter.
To determine $\Psi(x,y)$ we must then seek a solution of Laplace's equation
$\nabla^2 \Psi(x,y) = 0$ (19-6)
in the region exterior to the obstacle, which is such that on the boundary $C$
$\Psi$ takes on a constant value.
This suggests an indirect mode of solution of the steady-fluid-flow
problems. One examines the shapes of curves $\Psi(x,y) = \text{const}$ for various
harmonic functions $\Psi(x,y)$, and if a particular curve $\Psi(x,y) = k$ coincides
with the boundary $C$ of an obstacle of special technical interest, then the
function $\Psi(x,y)$ solves a special problem.
It follows from these remarks that any streamline $\Psi(x,y) = \text{const}$ can
be regarded as a rigid boundary of some obstacle.
Instead of determining the stream function $\Psi(x,y)$, we can equally well
determine a harmonic function $\Phi(x,y)$ which on the boundary $C$ satisfies
the condition
$\frac{\partial \Phi}{\partial n} = 0,$ (19-7)
where $\mathbf{n}$ is the unit normal to $C$, for the statement that the obstacle is
rigid implies that the normal component $v_n$ of $\mathbf{v}$ must vanish along $C$,
since no particles of fluid can cross $C$. But $v_n = \mathbf{n} \cdot \mathbf{v}$, and since $\mathbf{v} = \nabla\Phi$
and
$\frac{\partial \Phi}{\partial n} = \mathbf{n} \cdot \nabla\Phi = v_n,$
we see that (19-7) must hold on $C$.
It should be noted that we have assumed in the foregoing that there are
no sources or sinks in the region and that the fluid is incompressible.
Moreover, the flow is irrotational, and hence $\Phi(x,y)$ and $\Psi(x,y)$ are single-valued
functions. These considerations can be extended to the more
general situation in which circulation is present. However, as we shall see
from examples in the following section, the complex potential $F(z)$ will
then no longer be a single-valued function of $z$.
PROBLEM
Deduce from the boundary condition (19-7) that $\partial\Psi/\partial s = 0$ along $C$, so that $\Psi = \text{const}$
on $C$. Hint: Note that $\partial\Phi/\partial n = (\partial\Phi/\partial x)(dx/dn) + (\partial\Phi/\partial y)(dy/dn)$. Make use of
(19-1), and observe that $dx/dn = dy/ds$, $dy/dn = -dx/ds$ on $C$.
20. The Method of Conjugate Functions. We observed in the preceding
section that every analytic function $F(z) = u(x,y) + iv(x,y)$ can be
associated with some flow pattern of an incompressible fluid. In fact,
every such function determines two flow patterns, since either of the
harmonic functions $u(x,y)$, $v(x,y)$ can be regarded as determining the streamlines.
The simplest example of an irrotational flow is furnished by the function
$F(z) = cz = \Phi + i\Psi,$
where $c$ is a real constant. Since $z = x + iy$, we have $\Phi = cx$, $\Psi = cy$,
and thus the curves $\Psi = \text{const}$ are straight lines parallel to the $x$ axis.
The formula (19-4) for the velocity of the fluid yields $v_x = c$, $v_y = 0$, so
that the flow is parallel to the $x$ axis. Since $\operatorname{div} \mathbf{v} = 0$ and $\operatorname{curl} \mathbf{v} = 0$,
there are no sources or sinks in the region and the flow is irrotational.
As a more interesting example, consider
$F(z) = c\left(z + \frac{a^2}{z}\right) = \Phi + i\Psi, \qquad c > 0, \quad a > 0.$ (20-1)
If we set $z = re^{i\theta}$ in (20-1), we easily find that
$\Phi = c\left(r + \frac{a^2}{r}\right)\cos\theta, \qquad \Psi = c\left(r - \frac{a^2}{r}\right)\sin\theta.$
For $r = a$, we have $\Psi = 0$, and hence the boundary of the circle $r = a$
is a streamline. The pattern of streamlines is shown in Fig. 25 by the
solid lines, and the curves $\Phi = \text{const}$ are indicated by the dashed lines.
This flow pattern corresponds to a flow around a circular cylinder. The
velocity components are determined from
$F'(z) = c\left(1 - \frac{a^2}{z^2}\right) = v_x - iv_y.$
It is easy to verify that $\operatorname{div} \mathbf{v} = 0$ and $\operatorname{curl} \mathbf{v} = 0$, so that the flow is
irrotational. The points for which $v_x = v_y = 0$ are $z = \pm a$. These are
called the stagnation points.
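The properties claimed for the flow around a circular cylinder can be checked from the complex potential $F(z) = c(z + a^2/z)$. The sketch below is an illustration, not part of the original text: it uses the derivative $F'(z) = c(1 - a^2/z^2)$, obtained by differentiating the potential, together with the relation $F'(z) = v_x - iv_y$ of (19-4). The function names and default parameter values are assumptions.

```python
import cmath
import math


def velocity(z, c=1.0, a=1.0):
    """Velocity components (v_x, v_y) from F'(z) = v_x - i v_y,
    with F(z) = c (z + a**2 / z)."""
    dF = c * (1 - a * a / (z * z))
    return dF.real, -dF.imag


def stream_function(z, c=1.0, a=1.0):
    """Psi = Im F(z) for F(z) = c (z + a**2 / z)."""
    return (c * (z + a * a / z)).imag


if __name__ == "__main__":
    a, c = 2.0, 1.5
    for theta in (0.5, 1.0, 2.0):
        z = a * cmath.exp(1j * theta)           # a point on the cylinder r = a
        vx, vy = velocity(z, c, a)
        normal = vx * math.cos(theta) + vy * math.sin(theta)
        print(theta, stream_function(z, c, a), normal)   # both close to zero
    print(velocity(complex(a, 0), c, a), velocity(complex(-a, 0), c, a))
```

On the circle $r = a$ both the stream function and the normal velocity component vanish, so no fluid crosses the cylinder, and the velocity itself vanishes at $z = \pm a$.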
Let us investigate next the flow pattern determined by
$F(z) = c \log z = u + iv, \qquad z = re^{i\theta},$ (20-2)
where $c$ is a real constant.
If we consider only the one branch of this multiple-valued function for
which $0 \le \theta < 2\pi$, we get
$F(z) = c(\operatorname{Log} r + i\theta),$
so that $u = c \operatorname{Log} r$, $v = c\theta$, $0 \le \theta < 2\pi$.
If we set $\Psi = c\theta$, then the streamlines $\Psi = \text{const}$ are the radial lines
and the curves $\Phi = \text{const}$ are circles $c \operatorname{Log} r = \text{const}$ (Fig. 26). By Eq.
(15-1) of Chap. 5, the amount of the fluid crossing per second any closed
curve $C$ is
$V = \oint_C (v_x\,dy - v_y\,dx) = \oint_C d\Psi.$
But $\Psi = c\theta$, so that
$V = c \oint_C d\theta.$
This integral vanishes for any path that does not enclose the origin. If
the origin $z = 0$ is within $C$, then $V = 2\pi c$. Hence for $c > 0$, the flow is
outward and we have a source of strength $2\pi c$ at the origin. For $c < 0$,
we have a sink of the same strength. Thus, $\operatorname{div} \mathbf{v} = 0$ at all points except
$z = 0$.
The circulation $J$ is given by the integral¹
$J = \oint_C (v_x\,dx + v_y\,dy) = \oint_C d\Phi,$
and since $\Phi = c \operatorname{Log} r$, $J = 0$ and the flow is irrotational.
If, however, we take $\Phi = c\theta$ and $\Psi = c \operatorname{Log} r$, the roles of the curves
$\Phi = \text{const}$ and $\Psi = \text{const}$ in the preceding discussion are interchanged.
We thus conclude that for this flow the circulation $J = 2\pi c$ if $C$ encloses
the origin. This corresponds to the situation described as a point vortex
at the origin.
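Flux and circulation can be obtained together from a single contour integral, since $\oint_C F'(z)\,dz = \oint_C (v_x - iv_y)(dx + i\,dy) = J + iV$, where $J$ is the circulation and $V$ the flux through $C$. The following sketch is not part of the original text; the function name `circulation_and_flux` and the circular contour are assumptions. For the vortex it uses $F(z) = -ic \log z$, whose real part is $\Phi = c\theta$; with this choice $\Psi = -c \operatorname{Log} r$, which differs in sign from the pairing quoted above but leaves the streamlines as circles.

```python
import cmath
import math


def circulation_and_flux(dF, radius=1.0, center=0j, n=2000):
    """Return (J, V): real and imaginary parts of the integral of F'(z) dz
    over the circle |z - center| = radius, traversed counterclockwise."""
    total = 0j
    dt = 2 * math.pi / n
    for j in range(n):
        e = cmath.exp(1j * j * dt)
        total += dF(center + radius * e) * (1j * radius * e) * dt
    return total.real, total.imag


if __name__ == "__main__":
    c = 0.5
    print(circulation_and_flux(lambda z: c / z))              # source: F = c log z
    print(circulation_and_flux(lambda z: -1j * c / z))        # vortex: F = -i c log z
    print(circulation_and_flux(lambda z: c / z, center=3 + 0j))  # origin outside C
```

The source gives $J = 0$, $V = 2\pi c$; the vortex gives $J = 2\pi c$, $V = 0$; and both quantities vanish when the contour does not enclose the origin.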
The reader will find it of interest to study the function
$\Phi + i\Psi = c\left(z + \frac{a^2}{z}\right) - ic' \log z, \qquad a > 0, \quad c > 0,$
for which $\Psi = \text{const}$ when $|z| = a$. The function $\Psi(x,y)$ represents a
flow around a circular cylinder $r = a$ with the circulation $2\pi c'$.
As further examples of functions yielding useful solutions of interesting
physical problems consider the following:
1. The Transformation $w = \cosh z$. Here
$w = \frac{e^z + e^{-z}}{2} = \cosh z.$
Thus,
$u + iv = \cosh (x + iy) = \cosh x \cosh iy + \sinh x \sinh iy = \cosh x \cos y + i \sinh x \sin y,$
so that
$u = \cosh x \cos y, \qquad v = \sinh x \sin y,$
or
$\frac{u^2}{\cosh^2 x} + \frac{v^2}{\sinh^2 x} = 1, \qquad \frac{u^2}{\cos^2 y} - \frac{v^2}{\sin^2 y} = 1.$
¹ See Sec. 10, Chap. 5.
This transformation is shown in Fig. 27, and it may be used to obtain the
electrostatic field due to an elliptic cylinder, the electrostatic field due to
a charged plane from which a strip has been removed, the circulation of
liquid around an elliptic cylinder, the flow of liquid through a slit in a
plane, etc.
The transformation from the $z$ plane to the $w$ plane may be described
geometrically as follows: Consider the horizontal strip of the $z$ plane between
the lines $y = 0$ and $y = \pi$, and think of these lines as being broken and
pivoted at the points where $x = 0$. Rotate the strip 90° counterclockwise,
and at the same time fold each of the broken lines $y = 0$ and $y = \pi$ back
on itself, the strip thus being doubly "fanned out" so as to cover the
entire $w$ plane.
Fig. 27
It is interesting to note that this same transformation $w = \cosh z$ can
be used to solve a hydrodynamic problem of a different sort. When liquid
seeps through a porous soil, it is found that the component in any direction
of the velocity of the liquid is proportional to the negative pressure gradient
in that same direction. Thus, in a problem of two-dimensional
flow the velocity components $(u,v)$ are
$u = -k\frac{\partial p}{\partial x}, \qquad v = -k\frac{\partial p}{\partial y}.$
If these values are inserted in the equation of continuity, namely, in the
equation
$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0,$
the result is
$\frac{\partial^2 p}{\partial x^2} + \frac{\partial^2 p}{\partial y^2} = 0.$
Suppose, then, one considers the problem of the seepage flow under a
gravity dam which rests on material that permits such seepage. One
seeks (see Fig. 28) a function $p$ that satisfies Laplace's equation and that
satisfies certain boundary conditions on the surface of the ground. That
is, the pressure must be uniform on the surface of the ground upstream
from the heel of the dam and zero on the surface of the ground downstream
from the toe of the dam. If we choose a system of cartesian coordinates
$u$, $v$ with origin at the mid-point of the base of the dam (Fig.
28) and $u$ axis on the surface of the ground, then it is easily checked that
the function $p(u,v) = p_0\,y(u,v)/\pi$, where
$w = u + iv = a \cosh (x + iy),$
satisfies the demands of the problem. In fact, it was seen in the study of
the transformation $w = \cosh z$ that the line $y = \pi$ of the $z$ plane folds
up to produce the portion to the left of $u = -1$ of the $u$ axis in the $w$
plane and the line $y = 0$ of the $z$ plane folds up to produce the portion to
the right of $u = +1$ of the $u$ axis. The introduction of the factor $a$ in
the transformation merely makes the width of the base of the dam $2a$
rather than 2. These remarks show that $y(u,v)$ reduces to the constant
$\pi$ on the surface of the ground upstream from the heel of the dam. If the
head above the dam is such as to produce a hydrostatic pressure $p_0$, one
merely has to set
$p(u,v) = \frac{p_0\,y(u,v)}{\pi}.$
One can now find the distribution of uplift pressure across the base of the
dam. In fact, the base of the dam is the representation, in the w plane,
of the line x = 0, $0 < y < \pi$, of the xy plane. Hence, on the base of the
dam the equations
$$u = a\cosh x\cos y, \qquad v = a\sinh x\sin y,$$
reduce to
$$u = a\cos y, \qquad v = 0,$$
so that
$$p(u,0) = \frac{p_0}{\pi}\cos^{-1}\frac{u}{a}.$$
This curve is drawn in the figure. The total uplift force (per foot of
dam) is
$$P = \frac{p_0}{\pi}\int_{-a}^{a}\cos^{-1}\frac{u}{a}\,du = p_0 a,$$
which is what the uplift pressure would be if the entire base of the dam
were subjected to a head just one-half of the head above the dam or if
the pressure decreased uniformly (linearly) from the static head $p_0$ at the
heel to the value zero at the toe. The point of application of the resultant
uplift is easily calculated to be at a distance b = 3a/4 from the heel of the
dam. 1
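The two values just quoted, the total uplift $p_0 a$ and the distance 3a/4 from the heel, can be checked numerically. The following is only a sketch: the function name `uplift`, the sample values of p0 and a, and the choice of the midpoint rule are assumptions made for illustration, with the heel taken at u = -a, where the pressure equals $p_0$.

```python
import math

def uplift(p0=1.0, a=1.0, n=200_000):
    """Midpoint-rule estimate of the total uplift force and of the
    distance of its line of action from the heel (u = -a)."""
    du = 2 * a / n
    force = 0.0
    moment = 0.0  # moment of the pressure about the heel
    for k in range(n):
        u = -a + (k + 0.5) * du
        p = p0 / math.pi * math.acos(u / a)  # p(u, 0) on the base of the dam
        force += p * du
        moment += (u + a) * p * du
    return force, moment / force

force, b = uplift(p0=2.0, a=3.0)
print(force, b)  # compare with p0*a and 3*a/4
```

With p0 = 2 and a = 3 the two printed numbers should agree closely with 6 and 2.25.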
2. The Transformation $w = z + e^z$. One has
$$u + iv = x + iy + e^{x+iy} = x + iy + e^x(\cos y + i\sin y),$$
so that
$$u = x + e^x\cos y, \qquad v = y + e^x\sin y.$$
This transformation is shown in Fig. 29. If one considers the portion
of the z plane between the lines $y = \pm\pi$, then the portion of the strip to
the right of x = -1 is to be “fanned out” by rotating the portion of
$y = +\pi$ (to the right of x = -1) counterclockwise and the portion of
$y = -\pi$ (to the right of x = -1) clockwise until each line is folded back
on itself. This transformation gives the electrostatic field at the edge
of a parallel-plate condenser, the flow of liquid out of a channel into an
open sea, etc.
1 Some material in Secs. 18 to 20 is taken by permission from a lecture by Dr. Warren
Weaver printed in the October, 1932, issue of the American Mathematical Monthly.
Fig. 29
PROBLEMS
1. Study the flow determined by the complex potential $w = cz^2$ in a quadrant x > 0,
y > 0. The function $\Psi = 2cxy$ can be associated with the flow of fluid around a corner.
2. Study the flow determined by the complex potential $w = c\sin z$ in the semi-infinite
region $|x| < \pi/2$, y > 0.
21. The Problem of Dirichlet. The procedure for reducing solutions of
physical problems described in the preceding section is indirect. It depends
on the examination of various harmonic functions that satisfy the boundary
conditions appearing in specific physical situations.
In this section we outline a general procedure for constructing harmonic
functions which assume preassigned boundary values. Thus, let it be
required to determine a solution of Laplace’s equation
$$\nabla^2\Phi(x,y) = 0 \tag{21-1}$$
which on the boundary C of a given simply connected region R assumes
preassigned continuous values
$$\Phi = \varphi(s). \tag{21-2}$$
The variable s in (21-2) may be thought to be the arc parameter s measured
along C from some fixed point.
The boundary-value problem characterized by Eqs. (21-1) and (21-2)
is known as the Dirichlet problem, and it can be shown that the solution
of it exists and is unique whenever the boundary C is sufficiently smooth.
These conditions are usually met in physical problems.
We first outline a solution of this problem for the case when the region
R is the unit circle $|z| < 1$ and later indicate how this solution can be
generalized to yield a solution of the Dirichlet problem for an arbitrary
simply connected region with the aid of conformal mapping.
Thus, let it be required to construct in the circle $|z| < 1$ a harmonic
function $\Phi(x,y)$ such that on its boundary $\gamma$ (Fig. 30)
$$\Phi(x,y) = f(\theta), \tag{21-3}$$
where $f(\theta)$ is a specified function of the polar angle $\theta$.
Instead of determining $\Phi(x,y)$, it proves more convenient to determine
an analytic function
$$F(z) = \Phi(x,y) + i\Psi(x,y), \qquad |z| < 1 \tag{21-4}$$
whose real part takes on preassigned values (21-3) and then compute
$\Phi(x,y)$ by separating F(z) into its real and imaginary parts. Now, since¹
$F(z) + \bar F(z) = 2\Phi(x,y)$, we can write the boundary condition (21-3) in the
form
$$F(\zeta) + \bar F(\zeta) = 2f(\theta) \tag{21-5}$$
where $\zeta = e^{i\theta}$ represents the values of $z = re^{i\theta}$ on the boundary $\gamma$. If
we now multiply both members of (21-5) by $\dfrac{1}{2\pi i}\,\dfrac{d\zeta}{\zeta - z}$, where z is an
interior point of the circle, and integrate over $\gamma$, we get²
¹ We use bars to denote the conjugate values, so that $\bar F(z) = \Phi(x,y) - i\Psi(x,y)$.
² To prove that the conditions (21-5) and (21-6) are equivalent, one must impose
certain continuity restrictions on $f(\theta)$ usually met in the physical problems. See, for
example, I. S. Sokolnikoff, “Mathematical Theory of Elasticity,” 2d ed., p. 143, McGraw-
Hill Book Company, Inc., New York, 1956.
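$$\frac{1}{2\pi i}\int_\gamma \frac{F(\zeta)}{\zeta - z}\,d\zeta + \frac{1}{2\pi i}\int_\gamma \frac{\bar F(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{\pi i}\int_\gamma \frac{f(\theta)}{\zeta - z}\,d\zeta. \tag{21-6}$$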
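By Cauchy's integral formula, the first integral in the left-hand member
of (21-6) is equal to F(z). We show next that the second integral has a
constant value $\bar F(0)$ as long as $|z| < 1$. On expanding F(z) in Maclaurin's
series, we get
$$F(z) = F(0) + F'(0)z + \frac{1}{2!}F''(0)z^2 + \cdots + \frac{1}{n!}F^{(n)}(0)z^n + \cdots \tag{21-7}$$
which is convergent for all $|z| < 1$, since F(z) is assumed to be analytic
in $|z| < 1$.
If we set $z = \zeta$ in (21-7) and form the conjugate $\bar F(\zeta)$, we get
$$\bar F(\zeta) = \bar F(0) + \bar F'(0)\bar\zeta + \frac{1}{2!}\bar F''(0)\bar\zeta^2 + \cdots + \frac{1}{n!}\bar F^{(n)}(0)\bar\zeta^n + \cdots.$$
But on the circle $\gamma$, $\bar\zeta = e^{-i\theta} = 1/e^{i\theta} = 1/\zeta$, so that
$$\bar F(\zeta) = \bar F(0) + \frac{\bar F'(0)}{\zeta} + \frac{1}{2!}\frac{\bar F''(0)}{\zeta^2} + \cdots + \frac{1}{n!}\frac{\bar F^{(n)}(0)}{\zeta^n} + \cdots. \tag{21-8}$$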
The substitution of this series in the numerator of the second integral
in (21-6) then yields a series of integrals of the form
$$\frac{1}{n!}\,\frac{1}{2\pi i}\int_\gamma \frac{\bar F^{(n)}(0)}{(\zeta - z)\zeta^n}\,d\zeta, \qquad n = 0, 1, \ldots.$$
But the application of the residue theorem shows that these integrals
vanish for $n \ge 1$, and for n = 0 we get
$$\frac{1}{2\pi i}\int_\gamma \frac{\bar F(0)}{\zeta - z}\,d\zeta = \bar F(0) = a_0 - ib_0.$$
Thus (21-6) can be written in the form
$$F(z) = \frac{1}{\pi i}\int_\gamma \frac{f(\theta)}{\zeta - z}\,d\zeta - a_0 + ib_0, \tag{21-9}$$
where $a_0 + ib_0 = F(0)$.
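The real part $a_0$ of F(0) can be determined explicitly in terms of the prescribed
values $f(\theta)$ on $\gamma$, for on setting z = 0 in (21-9), we get
$$a_0 + ib_0 = \frac{1}{\pi i}\int_\gamma f(\theta)\,\frac{d\zeta}{\zeta} - a_0 + ib_0,$$
and therefore
$$a_0 = \frac{1}{2\pi i}\int_\gamma f(\theta)\,\frac{d\zeta}{\zeta}.$$
But $\zeta = e^{i\theta}$, so that $d\zeta/\zeta = i\,d\theta$ and hence
$$a_0 = \frac{1}{2\pi}\int_0^{2\pi} f(\theta)\,d\theta. \tag{21-10}$$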
Accordingly, the real part of F(z) is determined uniquely when $f(\theta)$ is
known. The real part of F(z) is the desired harmonic function $\Phi(x,y)$.
Since $\zeta = e^{i\theta}$, $f(\theta)$ can be expressed as a function of $\zeta$, say $g(\zeta)$, and we
see that the integral in (21-9) has the form
$$\frac{1}{\pi i}\int_\gamma \frac{g(\zeta)}{\zeta - z}\,d\zeta.$$
Integrals of this type can frequently be evaluated in closed form with the
aid of the theory of residues.
Formula (21-9) thus solves the general Dirichlet problem for the circular
region.
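As a numerical illustration of (21-9) and (21-10), the integral over $\gamma$ can be approximated by a sum over equally spaced points $\zeta_k = e^{i\theta_k}$. The sketch below is not part of the text's derivation: the name `schwarz`, the number of sample points, the choice $b_0 = 0$, and the boundary values $f(\theta) = \cos 2\theta$ (for which one expects $F(z) = z^2$, whose real part is $x^2 - y^2$) are all assumptions made for the example.

```python
import cmath
import math

def schwarz(f, z, n=512):
    """Approximate F(z) of (21-9), taking b0 = 0, from boundary values
    f(theta) sampled at n equally spaced points of the unit circle."""
    dtheta = 2 * math.pi / n
    integral = 0j  # (1/(pi i)) * integral of f(theta) d(zeta) / (zeta - z)
    a0 = 0.0       # mean value of f, formula (21-10)
    for k in range(n):
        theta = k * dtheta
        zeta = cmath.exp(1j * theta)
        # d(zeta) = i zeta d(theta), so (1/(pi i)) d(zeta) = (zeta/pi) d(theta)
        integral += f(theta) * zeta / (zeta - z) * dtheta / math.pi
        a0 += f(theta) * dtheta / (2 * math.pi)
    return integral - a0

z = 0.3 + 0.4j
F = schwarz(lambda t: math.cos(2 * t), z)
print(F, z * z)  # the two values should agree closely
```

The real part of the result is the harmonic function at the interior point z; the imaginary part is determined only up to the additive constant $b_0$.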
We indicate next how the Dirichlet problem for an arbitrary simply
connected region R can be solved when the function
$$w = \omega(z) \tag{21-11}$$
mapping the region R in the complex w plane conformally onto the circle
$|z| < 1$ is known. Let w = u + iv; then the desired harmonic function
$\Phi(u,v)$, assuming the prescribed values
$$\Phi(u,v) = \varphi(s) \tag{21-12}$$
on the boundary C of R, is the real part of some analytic function
$$\mathfrak{F}(w) = \Phi(u,v) + i\Psi(u,v). \tag{21-13}$$
On substituting in $\mathfrak{F}(w)$ from (21-11), we get
$$\mathfrak{F}[\omega(z)] = F(z),$$
which is analytic in the circle $|z| < 1$.
The values of the real part of F(z) on the boundary $\gamma$ of the unit circle
are known, since the values of $\Phi(u,v)$ on the boundary C are specified by
(21-12) and the points on C are mapped into points on $\gamma$ by (21-11). We
can thus write the boundary condition (21-12) in the form
$$\Phi = f(\theta) \quad \text{on } \gamma.$$
The substitution of $f(\theta)$ in formula (21-9) then yields F(z). To obtain
the desired function $\Phi(u,v)$, we must calculate the real part of $\mathfrak{F}(w)$, which
can be determined from F(z) by expressing z in terms of w with the aid
of (21-11).
It is clear that the solution of the problem of Dirichlet for an arbitrary
simply connected domain hinges on the construction of a suitable mapping
function (21-11). The fact that such a function exists is guaranteed by
Riemann’s theorem mentioned in the concluding paragraphs of Sec. 18.
During the past 30 years considerable attention has been given to the
problem of developing effective methods for constructing conformal maps
for simply connected domains. 1 A formula for conformal mapping
of a polygonal region on the unit circle (or alternatively, in the upper
half of the complex plane) has been supplied 2 by H. A. Schwarz (1843-
1921) and E. B. Christoffel (1829-1900).
During recent years extensive applications of complex variables to
broad classes of problems in the theory of elasticity have been made. 1
PROBLEMS
1. Use formula (21-9) to compute harmonic functions $\Phi(x,y)$ in the circular region
$x^2 + y^2 \le 1$, which assume on its boundary the following values: (a) $\Phi = x^2 + y^2$,
(b) $\Phi = x^2 - y^2$, (c) $\Phi = \cos^2\theta$, where $\theta$ is the polar angle. Hint: Note that $x = \frac{1}{2}(z + \bar z)$,
$y = (1/2i)(z - \bar z)$ and that on the boundary of the unit circle $\bar z = 1/z$.
2. Set $z = r(\cos\varphi + i\sin\varphi)$, $\zeta = e^{i\theta} = \cos\theta + i\sin\theta$ in (21-9); take account
of (21-10); and show that the real part $\Phi$ of F(z) is
$$\Phi(r,\varphi) = \frac{1}{2\pi}\int_0^{2\pi} \frac{(1 - r^2)\,f(\theta)}{1 - 2r\cos(\theta - \varphi) + r^2}\,d\theta.$$
This formula, giving the values of harmonic function $\Phi$ at every interior point $(r,\varphi)$ of
the unit circle in terms of the assigned boundary values $f(\theta)$, is known as Poisson's integral
formula. (Cf. Chap. 6, Sec. 12.) Because of the difficulty of evaluating real integrals,
this formula is generally less useful than the Schwarz formula (21-9).
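Although closed-form evaluation of Poisson's integral may be awkward, the formula lends itself to direct numerical evaluation. The following sketch assumes the boundary values $f(\theta) = \cos 2\theta$, for which the harmonic function inside the circle is $r^2\cos 2\varphi$; the function name `poisson`, the sample point, and the number of nodes are illustrative choices, not taken from the text.

```python
import math

def poisson(f, r, phi, n=512):
    """Evaluate Poisson's integral at the interior point (r, phi)
    by a sum over n equally spaced values of theta."""
    dtheta = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = k * dtheta
        kernel = (1 - r * r) / (1 - 2 * r * math.cos(theta - phi) + r * r)
        total += kernel * f(theta) * dtheta
    return total / (2 * math.pi)

r, phi = 0.5, 0.7
value = poisson(lambda t: math.cos(2 * t), r, phi)
print(value, r**2 * math.cos(2 * phi))  # the two values should agree closely
```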
22. Evaluation of Real Integrals by the Residue Theorem. Formula
(21-9) and the problems in Sec. 21 suggest the use of contour integration of
complex functions in the calculation of certain real integrals.
Thus, consider a real integral
$$\int_0^{2\pi} F(\sin\theta, \cos\theta)\,d\theta \tag{22-1}$$
1 There is a vast literature on this subject, and we cite only a book by L. V. Kantoro-
vich and V. I. Krylov, “Approximate Methods of Higher Analysis,” Groningen, 1958,
containing a comprehensive survey of the problem in chap. 5. A useful catalogue of
mapping functions is contained in the “Dictionary of Conformal Representation,”
Dover Press, New York, 1952, compiled by H. Kober.
2 This formula is contained in most books on complex-variable theory. See, for example,
R. V. Churchill, “Introduction to Complex Variables and Applications,” chap. 10,
McGraw-Hill Book Company, Inc., New York, 1948.
1 See Sokolnikoff, op. cit.
in which F is the quotient of two polynomials in $\sin\theta$ and $\cos\theta$. The
evaluation of such integrals, as we shall presently see, can be reduced to
the calculation of the integral of a rational function of z along the unit
circle $|z| = 1$. Since rational functions have no singularities other than
poles, the residue theorem (13-3) provides a simple means for evaluating
integrals of the form (22-1).
We set $z = e^{i\theta}$, so that $dz = e^{i\theta}\,i\,d\theta$
or
$$d\theta = \frac{dz}{iz} \tag{22-2}$$
and we recall Euler's formulas,
$$\cos\theta = \frac{z + z^{-1}}{2}, \qquad \sin\theta = \frac{z - z^{-1}}{2i}. \tag{22-3}$$
On inserting from (22-2) and (22-3) in (22-1) we get the integral
$$\int_C R(z)\,dz \tag{22-4}$$
in which R(z) is a rational function of z and C is the circular path $|z| = 1$.
If the sum of the residues of R(z) at the poles within the circle $|z| < 1$
is denoted by $\Sigma r$, the residue theorem yields $\int_C R(z)\,dz = 2\pi i\,\Sigma r$, so that
$$\int_0^{2\pi} F(\sin\theta, \cos\theta)\,d\theta = 2\pi i\,\Sigma r. \tag{22-5}$$
Example 1. As a specific illustration of this method of calculating integrals of the
type (22-1), consider
$$I = \int_0^{2\pi} \frac{d\theta}{1 + a\sin\theta}, \qquad 0 < a < 1. \tag{22-6}$$
On making substitutions in (22-6) from (22-2) and (22-3), we get the integral
$$I = \int_C \frac{dz}{iz[1 + a(z - z^{-1})/2i]} = \frac{2}{a}\int_C \frac{dz}{z^2 + (2i/a)z - 1} \tag{22-7}$$
where C is the circular path $|z| = 1$.
Since the roots of $z^2 + (2i/a)z - 1 = 0$ are
$$z_1 = -\frac{i}{a}\left(1 - \sqrt{1 - a^2}\right), \qquad z_2 = -\frac{i}{a}\left(1 + \sqrt{1 - a^2}\right), \tag{22-8}$$
we can write (22-7) as
$$I = \frac{2}{a}\int_C \frac{dz}{(z - z_1)(z - z_2)}. \tag{22-9}$$
But it is clear from (22-8) that for 0 < a < 1 we have $|z_1| < 1$ and $|z_2| > 1$, so that
only one pole $z = z_1$ of the integrand
$$R(z) = \frac{1}{(z - z_1)(z - z_2)}$$
lies within the unit circle. The residue of R(z) at $z = z_1$, by (12-6), is
$$r = \lim_{z \to z_1} R(z)(z - z_1) = \frac{1}{z_1 - z_2},$$
which, on noting (22-8), yields
$$r = \frac{a}{2i\sqrt{1 - a^2}}.$$
By the residue theorem, the value of (22-9), which is the same as that of the integral
(22-6), is
$$I = \frac{2}{a}\cdot 2\pi i\,r = \frac{2\pi}{\sqrt{1 - a^2}}.$$
The reader can verify by the same method, or by setting $\theta = \varphi - \pi/2$, that
$$\int_0^{2\pi} \frac{d\theta}{1 + a\cos\theta} = \int_0^{2\pi} \frac{d\theta}{1 + a\sin\theta} = \frac{2\pi}{\sqrt{1 - a^2}}, \qquad 0 < a < 1. \tag{22-10}$$
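The steps of Example 1 can be retraced numerically. The sketch below computes the poles (22-8), the residue at $z_1$, and the value $(2/a)\,2\pi i\,r$, and compares it with a direct sum approximating (22-6). The function names and the value a = 0.6 are assumptions chosen only for illustration.

```python
import math

def by_residue(a):
    """Value of (22-6) obtained from the residue at the pole inside |z| = 1."""
    s = math.sqrt(1 - a * a)
    z1 = -1j / a * (1 - s)  # pole inside the unit circle
    z2 = -1j / a * (1 + s)  # pole outside the unit circle
    assert abs(z1) < 1 < abs(z2)
    r = 1 / (z1 - z2)       # residue of 1/((z - z1)(z - z2)) at z1
    return (2 / a) * 2j * math.pi * r

def by_quadrature(a, n=4096):
    """Direct approximation of the real integral (22-6) by an equally spaced sum."""
    dtheta = 2 * math.pi / n
    return sum(dtheta / (1 + a * math.sin(k * dtheta)) for k in range(n))

a = 0.6
print(by_residue(a), by_quadrature(a), 2 * math.pi / math.sqrt(1 - a * a))
```

The first value is returned as a complex number whose imaginary part is zero up to rounding, which reflects the fact that the contour integral represents a real quantity.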
The infinite integral
$$\int_{-\infty}^{\infty} \frac{f(x)}{g(x)}\,dx \tag{22-11}$$
in which f(x) and g(x) are polynomials in x, can also be evaluated by
calculating the residues. It should be noted that the integral (22-11)
converges if, and only if,¹ g(x) = 0 has no real roots and the degree
of g(x) is at least two greater than that of f(x).
Now, consider the complex rational function
$$R(z) = \frac{f(z)}{g(z)} \tag{22-12}$$
which, obviously, assumes along the real axis the same values as the
integrand in (22-11). By hypothesis g(z) = 0 has no real roots; hence no
poles of R(z) lie on the real axis. We form the integral
$$\int_C R(z)\,dz = \int_C \frac{f(z)}{g(z)}\,dz$$
where the path C is the boundary of the semicircular region in the upper
half of the z plane shown in Fig. 31. Since all roots of g(z) lie at a finite
1 This follows directly from the usual tests on convergence of improper integrals.
See Chap. 2, Sec. 8.
distance from the origin, we can take the radius R of the semicircle $C_R$
so great that all poles of R(z) = f(z)/g(z), in the upper half of the z plane,
lie within the semicircle. If the sum of the residues at these poles is $\Sigma r$,
the residue theorem yields
$$\int_C \frac{f(z)}{g(z)}\,dz = \int_{-R}^{R} \frac{f(x)}{g(x)}\,dx + \int_{C_R} \frac{f(z)}{g(z)}\,dz = 2\pi i\,\Sigma r. \tag{22-13}$$
We show next that when the degree of g(z) is at least 2 greater than
that of f(z), the integral $\int_{C_R} \frac{f(z)}{g(z)}\,dz \to 0$ as $R \to \infty$, so that formula
(22-13) then yields
$$\int_{-\infty}^{\infty} \frac{f(x)}{g(x)}\,dx = 2\pi i\,\Sigma r. \tag{22-14}$$
For proof, set $z = Re^{i\theta}$ in R(z) = f(z)/g(z), and note that
$$\left|\frac{f(z)}{g(z)}\right| \le \frac{M}{|z|^2} = \frac{M}{R^2}, \qquad M = \text{const},$$
when R is sufficiently large. Hence, by (5-8)
$$\left|\int_{C_R} \frac{f(z)}{g(z)}\,dz\right| \le \int_{C_R} \frac{M}{R^2}\,|dz| = \frac{M}{R^2}\,\pi R = \frac{M\pi}{R},$$
from which it follows that the integral over $C_R$ tends to zero as $R \to \infty$.
Thus under the stated restrictions on f(z) and g(z) ensuring the convergence
of (22-11), formula (22-14) is true.
An improper integral like (22-11) should be understood in the sense
$$\lim_{R_1 \to \infty,\; R_2 \to \infty} \int_{-R_1}^{R_2} \frac{f(x)}{g(x)}\,dx \tag{22-15}$$
where $R_1$ and $R_2$ approach infinity in any manner. However, the method of calculation
indicated in the text actually gives
$$\lim_{R \to \infty} \int_{-R}^{R} \frac{f(x)}{g(x)}\,dx \tag{22-15a}$$
so that $R_1 = R_2$ in (22-15). The expression (22-15a) is termed the Cauchy principal
value of (22-15). If (22-15) exists (as in the case considered in the text) then obviously
(22-15a) exists and has the same value. But (22-15a) may exist when (22-15) does not;
for example, take f(x) = x, $g(x) = 1 + x^2$.
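The distinction can be made concrete for f(x) = x, $g(x) = 1 + x^2$, since the integral over any finite interval is known exactly: the antiderivative is $\frac{1}{2}\log(1 + x^2)$. In the sketch below the helper name `integral` and the particular choice $R_1 = R$, $R_2 = 2R$ are assumptions made for illustration.

```python
import math

def integral(lo, hi):
    """Exact value of the integral of x/(1 + x^2) from lo to hi."""
    return 0.5 * math.log((1 + hi * hi) / (1 + lo * lo))

for R in (10.0, 1e3, 1e6):
    symmetric = integral(-R, R)     # limits as in (22-15a)
    lopsided = integral(-R, 2 * R)  # one admissible choice in (22-15)
    print(R, symmetric, lopsided)
```

The symmetric values are all zero, while the lopsided ones approach $\log 2$, so the limit in (22-15) depends on how $R_1$ and $R_2$ tend to infinity and therefore does not exist.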
Example 2. To illustrate the use of formula (22-14) consider an elementary integral,
$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2}.$$
Here
$$R(z) = \frac{1}{1 + z^2},$$
so that the only singularity of R(z) in the upper half plane is the simple pole at z = i.
Since the residue of R(z) at z = i is 1/2i, formula (22-14) yields
$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = 2\pi i\cdot\frac{1}{2i} = \pi.$$
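Formula (22-14) is easily mechanized when all poles in the upper half plane are simple. The sketch below relies on the standard fact, not derived in this section, that the residue of f(z)/g(z) at a simple zero $z_k$ of g is $f(z_k)/g'(z_k)$. The function name and the second integrand, $1/(1 + z^4)$, are illustrative assumptions.

```python
import cmath
import math

def upper_half_plane_integral(f, dg, poles):
    """Return 2*pi*i times the sum of the residues f(z_k)/g'(z_k)
    over the simple poles z_k that lie in the upper half plane."""
    total = sum(f(z) / dg(z) for z in poles if z.imag > 0)
    return 2j * math.pi * total

# Example 2: f = 1, g = 1 + z^2, with poles at i and -i.
ex2 = upper_half_plane_integral(lambda z: 1, lambda z: 2 * z, [1j, -1j])

# A second integrand: f = 1, g = 1 + z^4, with poles at the four fourth roots of -1.
roots = [cmath.exp(1j * math.pi * (2 * k + 1) / 4) for k in range(4)]
quartic = upper_half_plane_integral(lambda z: 1, lambda z: 4 * z**3, roots)

print(ex2, quartic)  # compare with pi and pi/sqrt(2)
```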
The essential considerations that have led us to formula (22-14) are:
1. The integral over the semicircular boundary $C_R$ in (22-13) approaches
zero as $R \to \infty$.
2. The singularities of the integrand in the upper half of the z plane are
isolated and are at a finite distance from the origin.
3. There are no singular points on the real axis.
Clearly, the same procedure can be used to evaluate integrals of the form
$$\int_{-\infty}^{\infty} F(x)\,dx$$
by computing $\int_C F(z)\,dz$ as long as the integrand F(z) satisfies conditions
1, 2, and 3. Occasionally, a slight modification of the procedure
outlined above can be used when |F(z)| is not sufficiently small in the
upper half of the z plane, so that the condition 1 is not fulfilled by F(z).
We illustrate this in the following example.
Example 3. Evaluate
$$\int_{-\infty}^{\infty} \frac{\cos x}{a^2 + x^2}\,dx, \qquad a > 0.$$
If we take $F(z) = (\cos z)/(a^2 + z^2)$, the method outlined above cannot be applied
directly, since $|\cos z| = \frac{1}{2}|e^{iz} + e^{-iz}|$ becomes infinite when $z \to \infty$ along the y axis.
However, since cos x is the real part of $e^{ix}$, we can write
$$\int_{-\infty}^{\infty} \frac{\cos x}{a^2 + x^2}\,dx = \mathrm{Re}\int_{-\infty}^{\infty} \frac{e^{ix}}{a^2 + x^2}\,dx \tag{22-16}$$
where Re stands for the “real part of.”
Now, if we take
$$F(z) = \frac{e^{iz}}{a^2 + z^2} \tag{22-17}$$
then $|e^{iz}| = |e^{i(x+iy)}| = |e^{-y}| \le 1$ if $y \ge 0$.
Thus, F(z) in (22-17) is bounded in the
upper half of the z plane, and there is no
difficulty in showing that $\int_{C_R} F(z)\,dz \to 0$
as $R \to \infty$. Moreover, F(z) in (22-17)
has only two singular points, which are
poles at $z_1 = ia$ and $z_2 = -ia$. Only
one of these, $z_1 = ia$, lies in the upper half
plane. Accordingly,
$$\int_C \frac{e^{iz}}{a^2 + z^2}\,dz = 2\pi i\,\Sigma r \tag{22-18}$$
if C is the boundary of the semicircle (Fig. 32) and R is sufficiently large to include the
point z = ia.
Now the residue of (22-17) at z = ia is $e^{-a}/2ai$, and since $\int_{C_R} F(z)\,dz \to 0$ as $R \to \infty$,
we conclude from (22-18) that
$$\int_{-\infty}^{\infty} \frac{e^{ix}}{a^2 + x^2}\,dx = 2\pi i\cdot\frac{e^{-a}}{2ai} = \frac{\pi}{a}\,e^{-a}.$$
This result is real, and hence the integral in (22-16) is
$$\int_{-\infty}^{\infty} \frac{\cos x}{a^2 + x^2}\,dx = \frac{\pi}{a}\,e^{-a}. \tag{22-19}$$
Inasmuch as the integrand in (22-19) is even, we conclude that
$$\int_0^{\infty} \frac{\cos x}{a^2 + x^2}\,dx = \frac{\pi}{2a}\,e^{-a}. \tag{22-20}$$
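The result (22-19) can be compared with a direct numerical integration over a long but finite interval. The following sketch uses the composite Simpson rule; the value a = 1.5, the truncation point L = 400, and the number of subintervals are assumptions chosen for illustration, and the neglected tails limit the agreement to roughly $1/L^2$.

```python
import math

def simpson(func, lo, hi, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

a = 1.5
L = 400.0  # truncation point; the neglected tails are of order 1/L**2
numeric = 2 * simpson(lambda x: math.cos(x) / (a * a + x * x), 0.0, L, 80_000)
exact = math.pi / a * math.exp(-a)
print(numeric, exact)
```

The slow, oscillatory decay of the integrand is what makes the direct computation laborious and the residue calculation attractive by comparison.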
PROBLEMS
i:
f Zv
1. Use relations (22-2) and (22-3) to write the integrals j j
2t fa ''
UV * it e /rtn i i a ii i j * » * t
d0
$4 H- sin 0 an< *
in the form (22-4) and evaluate the resulting integrals by the residue
'o 6 4- 3 cos 0
theorem. Check your calculations by formula (22-10).
2. Show that $\int_0^{2\pi} \dfrac{d\theta}{(1 + a\cos\theta)^2} = \dfrac{2\pi}{(1 - a^2)^{3/2}}$, 0 < a < 1.
f* x 2 dx r
3. Show that
4 . Beferring to Prob. 2, show that
6* Show that:
f.
o (a -b cos 0) 2 (a 2 — 1)^ ’
\/2
if a > 1.
(a) $\int_0^{\infty} \dfrac{dx}{1 + x^4} = \dfrac{1}{2}\int_{-\infty}^{\infty} \dfrac{dx}{1 + x^4} = \dfrac{\pi\sqrt{2}}{4}$
m r X 2 dx X
(6)
/IT w
if a > 0.
r* cos ax
(e) / — — s dx ~ -e
Jo 1
4* x*
6 . Show that
7, Show that
r
COB xdx X
iTM 2 ’ a* - ’
$$\int_{-\infty}^{\infty} \frac{dx}{(1 + x^2)^n} = \frac{\pi\,(2n - 2)!}{2^{2n-2}\,[(n - 1)!]^2}$$
if n is a positive integer. Hint: The residue of $(1 + z^2)^{-n}$ at z = i is
$$-i\,\frac{n(n + 1)\cdots(2n - 2)}{(n - 1)!\;2^{2n-1}}.$$
One way of seeing this is to let t = z - i, so that $(1 + z^2)^{-n} = (it)^{-n}\,2^{-n}\,(1 - \frac{1}{2}it)^{-n}$.
The coefficient of 1/t is easily found by use of the binomial theorem.
CHAPTER 8
PROBABILITY
Fundamentals of Probability Theory
1. A Definition of Probability
2. Sample Space
3. The Theorems of Total and Compound Probability
4. Random Variables and Expectation
5. Discrete Distributions
6. Continuous Distributions
Probability and Relative Frequency
7. Independent Trials
8. An Illustration
9. The Laplace-de Moivre Limit Theorem
10. The Law of Large Numbers
Additional Topics in Probability
11. The Poisson Law
12. The Theory of Errors
13. Variance, Covariance, and Correlation
14. Arithmetic Means
15. Estimation of the Variance
“. . . la théorie des probabilités n'est que le bon sens confirmé
par le calcul” (“. . . the theory of probability is only common sense confirmed
by calculation”). Laplace.
There is no branch of mathematics that is more intimately connected
with everyday experiences than the theory of probability. Recent de-
velopments in mathematical physics, moreover, have emphasized the
importance of this theory in every branch of science. A knowledge of
probability is required in such diverse fields as quantum mechanics,
kinetic theory, the design of experiments, and the interpretation of data.
A recently developed branch of mathematics known as operations analysis
applies probability methods to questions in traffic control, allocation of
equipment, and the theory of strategy. Cybernetics, another field of
recent origin, uses the theory to analyze problems in communication and
control. In this chapter on probability the reader is introduced to some of
the ideas that make the subject so useful.
FUNDAMENTALS OF PROBABILITY THEORY
1. A Definition of Probability. The idea of chance enters into everyday
conversation: “It will probably rain tomorrow,” “There may be a letter
for me at the office,” “I probably won’t get double six on the next throw.”
It is often possible to assign a numerical measure to the notion of proba-
bility which these statements illustrate. Such a measure, however, must
take account of the speaker’s state of knowledge. For instance, in the
second statement the mailman may know that a letter is there, since he
put it there himself. His measure of probability and mine are therefore
not the same. Probability for me is based on my knowledge, and proba-
bility for him is based on his.
From this viewpoint (which is one of several possible viewpoints)
probability is a measure of ignorance. In simple cases the state of ignorance
can be accounted for, and probability can be defined as follows: We agree
to regard two events as equally likely if our ignorance is such that we have
no reason to expect one rather than the other. For example a 4 or a 6
is equally likely when a true die is tossed; heads and tails are equally
likely in a toss of a symmetric coin; ace of hearts and ace of spades are
equally likely to be drawn from a shuffled deck.
In the latter example how shall we measure the probability that the
card drawn will in fact be the ace of hearts? We say that there is “one
chance out of 52” and define the probability, accordingly, to be 1/52. If
it is required only that the card be an ace, common sense suggests that
the probability should be four times as great, for there are four aces,
equally likely, and only one ace of hearts. Now, the value 4/52 is, indeed,
the probability that a card drawn at random is an ace. Reasoning in this
way, we are led to the following definition:
Definition. Suppose there are n mutually exclusive, exhaustive, and
equally likely cases. If m of these are favorable to an event A, then the probability
of A is m/n.
The term mutually exclusive means that two cases cannot both happen
at once; the term exhaustive means that all possible cases are enumerated
in the n cases. There is seldom difficulty in seeing that these conditions
are satisfied, but careful analysis is sometimes needed to make sure that
the cases are equally likely. For example, let two coins be tossed, and
consider the probability that they both show heads. We might reason
that the total number of cases is three, namely, two heads, a head and a
tail, or two tails. Since only one case is favorable, the probability is 1/3.
Now, this reasoning is incorrect. It is true that there are three cases, but
these cases are not equally likely. The case of a head and a tail is twice
as likely as the others, since it can be realized with a head on the first
coin or with a head on the second coin. The reader can verify that there
are four equally likely cases and that the required probability is 1/4.
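The verification suggested to the reader can be carried out by listing the cases. The sketch below enumerates the four ordered outcomes of two coins and applies the definition m/n; the helper name `probability` is an assumption introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# The four equally likely cases: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

def probability(event):
    """Favorable cases divided by the total number of equally likely cases."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

p_two_heads = probability(lambda o: o == ("H", "H"))
p_head_and_tail = probability(lambda o: set(o) == {"H", "T"})
print(p_two_heads, p_head_and_tail)
```

The case of a head and a tail is seen to be realized by two of the four outcomes, which is why it is twice as likely as either of the others.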
If an event is certain to happen, then its probability is 1 , since all cases
are favorable. On the other hand if an event is certain not to happen its
probability is zero, since no case is favorable. By means of the definition
the reader may also verify the important equation
q = 1 - p,
where p is the probability that an event happens and q the probability
that it fails to happen.
Since one must begin somewhere, it is impossible to define everything, and every
mathematical theory contains some undefined terms. These terms should be so simple
that they are easily understood and also so simple that they are not readily defined in
terms of anything simpler. The notion “equally likely" is an example of such a term;
it was explained and illustrated in the foregoing discussion but not defined.
Example 1. If a pair of dice is thrown what is the probability that a total of 8 shows?
The first die can fall in 6 ways, and for each of these the second can also fall in 6 ways.
The total number of ways is
6 + 6 + 6 + 6 + 6 + 6 = 6·6 = 36
and these are equally likely in this problem. A sum of 8 can be obtained in 5 ways,
namely, as
2 + 6, 6 + 2, 3 + 5, 5 + 3, 4 + 4
and hence the desired probability is 5/36.
This computation of the total number of cases illustrates an important principle of
combinatory analysis: If one thing can be done in n different ways and another thing can
be done in m different ways, then both things can be done together or in succession in mn
different ways.
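The count in this example can be confirmed by enumerating all ordered pairs of faces. This is only a sketch; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # all ordered pairs (first die, second die)
counts = Counter(a + b for a, b in rolls)     # number of cases giving each total
p_eight = Fraction(counts[8], len(rolls))
print(len(rolls), counts[8], p_eight)
```

The same table of counts shows at a glance which total is favored by the largest number of cases.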
Example 2. In a well-shuffled deck what is the probability that the top 4 cards are,
respectively, ace, two, three, and four of hearts?
To find the number of equally likely cases we consider the various possibilities for the
top 4 cards. The first card may be any one of 52; for each determination of that card
there remain 51 possibilities for the next; and so on. Repeated use of the principle
mentioned at the end of the last example gives
52·51·50·49
for the total number of cases. Since only one case is favorable, the desired probability
is the reciprocal of this.
When r things are dealt into r numbered spaces from a stack of n distinct things, then
any particular arrangement of the objects is called “a permutation of n things r at a
time.” If the total number of such permutations be denoted by ${}_nP_r$, the foregoing
reasoning yields the important formula
$${}_nP_r = n(n - 1)(n - 2)\cdots(n - r + 1).$$
Example 3. If a hand of 4 cards is dealt from a shuffled deck what is the probability
that the hand consists of ace, two, three, and four of hearts?
The difference between this example and the preceding is that now the order is not
relevant. Let C denote the number of distinct 4-card hands, not counting order. Then
the number of distinct 4-card hands when the order is counted is
$$C\cdot{}_4P_4,$$
since each hand of 4 cards admits ${}_4P_4$ different orderings of its members. On the other
hand the number of distinct 4-card hands when order is counted is also equal to ${}_{52}P_4$
by Example 2. We have, therefore,
$$C\cdot{}_4P_4 = {}_{52}P_4,$$
so that
$$C = \frac{{}_{52}P_4}{{}_4P_4} = \frac{52\cdot51\cdot50\cdot49}{4\cdot3\cdot2\cdot1} = \frac{52!}{4!\,48!}.$$
The desired probability is the reciprocal.
When r things are taken from a stack of n things, the groups so obtained are called
“combinations of n things r at a time.” If the number of such combinations is denoted
by ${}_nC_r$, the above reasoning gives the important formula
$${}_nC_r = \frac{{}_nP_r}{{}_rP_r} = \frac{n!}{r!\,(n - r)!}.$$
In this formula the arrangement of members in a group is not considered. As in the case
of poker hands, two groups are counted as distinct only if they have different composi-
tions.
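The counts of Examples 2 and 3 can be reproduced with the functions `math.perm` and `math.comb` of the Python standard library (available from Python 3.8). The sketch below is an illustration only; the variable names are assumptions.

```python
import math
from fractions import Fraction

ordered = math.perm(52, 4)  # permutations of 52 things 4 at a time (Example 2)
hands = math.comb(52, 4)    # combinations of 52 things 4 at a time (Example 3)

assert ordered == 52 * 51 * 50 * 49
assert hands * math.perm(4, 4) == ordered  # each hand admits 4P4 orderings

print(Fraction(1, ordered), Fraction(1, hands))
```

The second assertion is exactly the relation between the two counts used in Example 3 to obtain C.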
Example 4. What is the probability of drawing 4 white, 3 black, and 2 red balls from
an urn containing 10 white, 4 black, and 3 red balls?
We suppose that the balls are not replaced. The number of ways to get 9 balls from
the 17 is ${}_{17}C_9$. The number of ways to get 4 white from the 10 white is ${}_{10}C_4$. The 3
black balls can be chosen in ${}_4C_3$ ways, and the 2 red ones in ${}_3C_2$ ways. The number of
favorable cases is found by multiplication (cf. Example 1), so that the desired probability
is
$$\frac{{}_{10}C_4\cdot{}_4C_3\cdot{}_3C_2}{{}_{17}C_9} = \frac{252}{2431}.$$
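The arithmetic of Example 4 can be checked with exact fractions. The following is a small sketch using only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

favorable = comb(10, 4) * comb(4, 3) * comb(3, 2)  # 4 white, 3 black, 2 red
total = comb(17, 9)                                # any 9 balls out of the 17
p = Fraction(favorable, total)
print(favorable, total, p)
```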
Example 5. If a number x is chosen at random on the interval 0 < x < 1, what is the
probability that K < £ < M?
We imagine the unit interval divided into 7 segments each of length 1/7 (Fig. 1).
Since the point may be in any one of these there are 7 cases, and the phrase
“at random” ensures that these cases are equally likely. Since only 2 cases are favorable,
the desired probability is 2/7.
PROBLEMS
1. What is the probability that the sum of 7 appears in a single throw with two dice?
What is the probability of the sum of 11? Show that 7 is the most probable throw.
2 . An urn contains 20 balls: 10 white, 7 black, and 3 red. What is the probability
that a ball drawn at random is red? White? Black? If 2 balls are drawn, what is the
probability that both are white? If 10 balls are drawn, what is the probability that 5
are white, 2 black, and 3 red?
3. “If 3 coins are tossed, some pair is sure to come down alike. The chance that the
third coin fell the same way as that pair is 1/2; and hence the probability that all 3 fall
alike is 1/2.” What (if anything) is wrong with this argument? What is the probability
that 3 coins will fall alike?
4 . What is the probability that a 5-card hand at poker consists of 4 kings and an odd
card? 5 spades? A sequence in the same suit, such as 2, 3, 4, 5, 6 of hearts?
5. In how many ways can you seat 8 persons at a table? Arrange 8 children in a ring
to dance around a Maypole? Make a bracelet of 8 different beads on a loop of string?
6. The seats in a concert hall are arranged in an m by n rectangle, the side m being
parallel to the stage. What is the chance that a ticket bought at random will be for a
seat in back? On the side? Somewhere on the outside rows of the rectangle?
7. Two dice are tossed. (a) What is the probability that the first die shows 2? (b) Suppose
you are given the additional information that the total shown by both dice is 9.
What is now the probability that the first die shows 2? (c) If no information is given,
what is the probability that the total shown is 3? (d) If it is known that the first die
gave 2, what is now the probability that the total is 3? (Assume that the various
numbers on the second die are equally likely no matter what is known about the first die.)
2. Sample Space. The equally likely cases associated with the definition
of probability represent the possible outcomes of an experiment. For
instance, the 36 equally likely cases associated with a pair of dice are the
36 ways the dice may fall. Similarly if 3 coins are tossed, there are 8
equally likely cases corresponding to the 8 possible outcomes of that
experiment. The set of all possible outcomes is called a sample space;
the “points” of the sample space are events. This notion of sample space
is meaningful even when the events are not equally likely and even if
there are infinitely many possible outcomes. For technical reasons, how-
ever, the events composing the sample space are required to be mutually
exclusive. In tossing a die the events “an even number shows” and “6
shows” are not suitable for one and the same sample space.
A finite sample space is one which has only a finite number of points.
In such a space let the points (that is, events) have respective probabilities
$$p_1, p_2, \ldots, p_n$$
with $p_1 + p_2 + \cdots + p_n = 1$.
Suppose the first m sample points, and only those, are favorable to another
event A. Then we define the probability of A to be
$$p(A) = p_1 + p_2 + \cdots + p_m \tag{2-1}$$
(and similarly if some other set of sample points is in question). Thus,
the points of the sample space are weighted according to their probabilities.
The reader should observe that this definition is consistent with that of
the foregoing section: If each point of the sample space has the same
probability 1/n, the result (2-1) becomes
$$p(A) = \frac{1}{n} + \frac{1}{n} + \cdots + \frac{1}{n} = \frac{m}{n}.$$
Sample spaces with constant probability are called uniform.
For an example of a nonuniform sample space, consider the following
experiment: Four coins are tossed, and we are interested in the number of
heads. An appropriate sample space is composed of the events
no heads, one head, two heads, three heads, four heads
with respective probabilities, or weights,
1/16, 4/16, 6/16, 4/16, 1/16.
These values are found by counting cases, as follows. The 4 coins can fall in $2^4$, or
16, ways. They give no heads in only one case, namely, when they all fall tails, and
hence the required probability is 1/16. To obtain 1 head there are 4 cases: heads on the
first coin or on the second coin, and so on. This gives 4/16. For 2 heads, the 2 coins
giving heads can be any 2 of the 4 coins. Since there are ${}_4C_2 = 6$ ways to choose 2 coins
out of 4, there are 6 cases favorable to the event, two heads. The probability, then,
is 6/16. The other entries are found in the same way, or by symmetry.
To illustrate the use of this sample space, let us find the probability
of getting at least two heads. Since the last three points of the sample
space, and only those, are favorable to this event, the required probability
is
6/16 + 4/16 + 1/16 = 11/16.
Again, the probability that there is an odd number of heads is
4/16 + 4/16 = 1/2,
since that event corresponds to the second and fourth point. On the other
hand, this sample space does not give the probability that the third coin
will fall heads, although the underlying uniform space tells us that the
probability is 1/2.
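The relation between the nonuniform space (number of heads) and the underlying uniform space (the 16 ordered outcomes) can be displayed by direct enumeration. The sketch below is illustrative only; the variable names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

tosses = list(product("HT", repeat=4))          # the 16 equally likely outcomes
heads = Counter(t.count("H") for t in tosses)   # cases giving 0, 1, 2, 3, 4 heads
weights = {k: Fraction(heads[k], len(tosses)) for k in range(5)}

p_at_least_two = sum(weights[k] for k in (2, 3, 4))
p_odd = weights[1] + weights[3]
# This event cannot be expressed through the number of heads alone;
# it must be computed in the underlying uniform space.
p_third_heads = Fraction(sum(1 for t in tosses if t[2] == "H"), len(tosses))

print(weights, p_at_least_two, p_odd, p_third_heads)
```

Fractions are printed in lowest terms, so that 4/16 and 6/16 appear as 1/4 and 3/8.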
Additional information concerning the experimental situation is apt to
change the sample space. For example, if a toss of a die is known to have
given an even number, the probabilities of 1, 3, and 5 are changed from
1/6 to 0. This question is discussed in Examples 2 and 3.
When two sample spaces are constructed for a given experiment by the procedure of
the text, it can be shown that they are consistent; that is, they give the same probability
for any event to which they both apply. This fact is illustrated in the problems, though
we do not give a formal proof.
The notion of sample space enables us to define probability even when there is no
underlying set of equally likely cases. Suppose we are given n events and a corresponding
set of nonnegative numbers $p_i$ such that $p_1 + p_2 + \cdots + p_n = 1$. The events
are said to form a sample space, the numbers $p_i$ are called probabilities, and the
probability of various associated events is defined by addition, as in the text. This abstract
idea can be extended to sets of very general type, the role of the numbers $p_i$ being taken
by a so-called measure on the set. With such an approach probability theory is included
in a branch of mathematics known as the theory of measure.¹ A sample space defined
with the help of arbitrary numbers $p_i$ is considered in Example 1.
Example 1. A loaded die has probabilities
$p_1, p_2, p_3, p_4, p_5, p_6$
of giving the respective values
1, 2, 3, 4, 5, 6.
What is the meaning of the condition $p_1 + p_2 + \cdots + p_6 = 1$? If this condition is
satisfied, find the probability that a single toss will give either a 4 or a 6.
The condition means that one of the stated alternatives will certainly happen; for
instance, the die does not land on edge. From a more abstract viewpoint, the condition
means simply that the given events and probabilities form a sample space. When that
is the case, the probability of getting 4 or 6 is $p_4 + p_6$ by definition.
The assumption that “the probabilities are $p_i$” is an example of a statistical hypothesis.
It is an important task of statistical theory to test the validity of such hypotheses by
examining the consequences.
The reader should notice that the values $p_i$ were not given, and could hardly be
given, by considering “equally likely cases.” They may be estimated, however, by
repeatedly tossing the die. When $p_1$ is the probability of the ace, it can be shown that
the proportion of aces actually observed, in a large number of tosses, is likely to be close
to $p_1$. If there are n tosses, and if m aces are observed, this proportion m/n is called
the relative frequency. The connection between probability and relative frequency is
discussed in Secs. 8 to 10.
¹ See Appendix C.
Example 2. Two coins are tossed. Suppose a reliable witness tells us “at least 1 coin
showed heads.” What effect does this have on the uniform sample space?
The uniform sample space had the following appearance before we received the extra
information:
Event          HH    TH    HT    TT
Probability    1/4   1/4   1/4   1/4
The new information assures us that the last event is ruled out but gives no indication
concerning which of the other three may have occurred. Since these three events were
equally likely to begin with, they are considered to be equally likely in the new
situation. (That is not a theorem, but an axiom of probability theory.) The new sample
space, therefore, is
Event          HH    TH    HT    TT
Probability    1/3   1/3   1/3   0
Example 3. The tossing of 2 coins can be described by the following sample space:
Event          no heads    one head    two heads
Probability    1/4         1/2         1/4
What happens to this sample space if we know that at least 1 coin showed heads but
have no other special information?
The first event is ruled out, but we are not told which of the remaining ones occurred.
It is an axiom of probability theory that the relative probabilities of the remaining
events remain unchanged in a situation such as this. Since the event “1 head” is twice
as likely as “2 heads” in the original space, the same is assumed in the new one. The
new sample space is therefore
Event          no heads    one head    two heads
Probability    0           2/3         1/3
(remember that the probabilities must add up to 1). The reader should check that this
result is consistent with that of Example 2.
If the events $E_1, E_2, \ldots, E_k$ of the sample space are the ones favorable to A and have
probabilities $p_1, p_2, \ldots, p_k$, the information that A happened gives a new sample space
with events $E_1, E_2, \ldots, E_k$ only. The probabilities on that new sample space are
$cp_1, cp_2, \ldots, cp_k,$
where c is a constant so chosen that the sum is 1:
$$c = \frac{1}{p_1 + p_2 + \cdots + p_k}.$$
This is the general assertion which is illustrated in Examples 2 and 3.
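The rescaling by the constant c can be written out as a short routine. This is a sketch under stated assumptions, not part of the original text: a sample space is represented as a Python dictionary from events to probabilities, the function name `condition` is hypothetical, and events that are ruled out are kept with probability 0, as in the tables of Examples 2 and 3.

```python
from fractions import Fraction

def condition(space, favorable):
    """Return the new sample space after learning that the event A happened.

    `space` maps each sample point to its probability; `favorable` is the
    set of points E_1, ..., E_k favorable to A.
    """
    total = sum(space[e] for e in favorable)   # p_1 + p_2 + ... + p_k
    c = 1 / total                              # the constant c of the text
    return {e: (c * p if e in favorable else Fraction(0))
            for e, p in space.items()}

# Example 2: uniform space for two coins, told "at least 1 coin showed heads".
two_coins = {e: Fraction(1, 4) for e in ("HH", "TH", "HT", "TT")}
example_2 = condition(two_coins, {"HH", "TH", "HT"})

# Example 3: the same experiment described by the number of heads.
head_count = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
example_3 = condition(head_count, {1, 2})
```

Both spaces assign the same probability to "two heads" after conditioning, which is the consistency the reader is asked to check at the end of Example 3.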
PROBLEMS
1. A coin is tossed 3 times. Construct a uniform sample space for this experiment.
(That is, make a table showing the 8 possible outcomes HHH, HHT, . . . and their
respective probabilities . . . .) According to your sample space what is the
probability of at least one H? At most one H? A run of exactly two H’s in succession? A run
of at least two H’s in succession? H appearing before T? H appearing for the first
time in the second toss? The sequence THT? The sequence TTT?
2. In Prob. 1 suppose we are concerned only with the number of H’s. Construct an
appropriate sample space. (That is, make a table showing the 4 possible outcomes: no
H’s, one H, . . . with their respective probabilities.) Decide which questions in Prob. 1
can be answered on the basis of this sample space, answer them, and verify the
agreement with your answers to Prob. 1.
3. The following argument is attributed to Leibniz: “A total of 12 with 2 dice is just
as likely as a total of 11. For, 12 can materialize in just one way, namely, by getting
6 on one die and 6 on the other; and 11 can also materialize in just one way, namely, by
getting 6 on one die and 5 on the other.” Using the notion of sample space explain
what is wrong with Leibniz’ conclusion. (With the uniform sample space, 11 can
materialize in 2 ways. On the other hand if we choose a sample space in which the
event “6 on one die and 5 on the other” is a single point, the weight of this point is
different from that of the point “6 on one die and 6 on the other.” The student should
verify these remarks in detail.)
4. The following is due to d’Alembert: “If we want to get at least one head with 2
tosses of a coin, heads on the first toss makes the second toss unnecessary. So there are
3 cases, H, TH, and TT, of which 2 are favorable to heads. Hence the probability of
heads is 2/3.” Discuss, with reference to the uniform sample space and also with
reference to the sample space which has only the three points H, TH, TT. (Ambiguities
such as this and the preceding can cause serious errors in practice if the notion of sample
space is not well understood. In fact, one of the reasons for defining the sample space
is to avoid this kind of difficulty.)
5. What happens to the uniform space associated with a pair of dice, if we are told
that the total shown is 7?
6. Four coins are tossed. A reliable witness tells us that there are at least as many
heads as tails. What is the most probable number of heads, and what is its probability?
Suggestion: Use the sample space given in the text.
7. A coin is tossed 3 times. If we know that a sequence of 2 tails in a row did not
occur, what is the probability that a sequence of 3 heads in a row did occur? Suggestion:
Use the uniform sample space.
3. The Theorems of Total and Compound Probability. Statements about
probability are often given an abbreviated notation. If A and B are
events, AB means the event “A and B”; that is, AB happens only when
both A and B happen. For example, if two cards are drawn in succession
without replacing, suppose A is the event “the first draw gives a king”
and B is the event “the second draw gives an ace.” Then AB happens
if we get a king on the first draw followed by ace on the second.
It is customary to write p(A) for “the probability of the event A.”
In the foregoing example p(A) = 4/52 since there are 4 kings among the
52 cards. If nothing is known about the results of the first draw, then
p(B) = 4/52 also.
To see this, note that the total number of cases is 52·51, since there are 52 ways to
get the first card and, when that card is chosen, 51 ways remain to get the second. To
count the cases favorable to B, observe that the ace obtained on the second draw may
be any one of the 4 aces. For each choice of this ace there remain 51 possibilities for
the first card. The number of favorable cases, then, is 4·51, and hence
$$p(B) = \frac{4 \cdot 51}{52 \cdot 51} = \frac{4}{52}. \tag{3-1}$$
Sometimes two events A and B are so related that the information that
A happened changes the probability of B. To deal with this situation it
is customary to write $p_A(B)$ for “the probability of B, given A.” In the
example cited previously,
$$p_A(B) = \frac{4}{51} \tag{3-2}$$
(for if A happened, the first draw gave a king, and hence the 4 aces are
to be found among the remaining 51 cards). On the other hand when A
is the event “the first draw gives an ace” and B, as before, is the event
“the second draw gives an ace,” then $p_A(B) = 3/51$ (since now only 3
aces remain when A happens). Both values for $p_A(B)$ are different from
p(B), the probability of ace on the second draw when nothing is said about
the first draw.
In this notation the theorem of compound probability takes the following
form:
Theorem. If A and B are any events, then
$$p(AB) = p(A)\,p_A(B). \tag{3-3}$$
Informally, “the probability that A and B happen is the probability
that A happens times the probability that B then happens.” A proof is
easily given by considering equally likely cases. Let $n_a$, $n_b$, and $n_{ab}$ denote
the numbers of cases favorable to A, B, and AB, respectively. Then
$$p(AB) = \frac{n_{ab}}{n} = \frac{n_a}{n}\,\frac{n_{ab}}{n_a}.$$
Now, $n_a/n$ is p(A) by definition. After A has happened, the only possible
cases are the $n_a$ cases favorable to A. Of these, there are $n_{ab}$ cases
favorable to B. Since the $n_a$ cases are to be considered equally likely, the
quotient $n_{ab}/n_a$ represents the probability of B when it is known that
A happened, and this gives (3-3).
To illustrate the theorem (3-3), let us find the probability of drawing
2 aces in succession from a pack of 52 cards. The probability of ace on
the first trial is 4/52. After the first ace has been drawn, the probability
of drawing another ace from the remaining 51 cards is 3/51, so that the
probability of two aces is
$$\frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221}.$$
This assumes that the first card is not replaced. When it is replaced, the
reader will find that the desired probability is
$$\frac{4}{52} \cdot \frac{4}{52} = \frac{1}{169}.$$
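The theorem of compound probability can be verified for this example by direct counting of the 52·51 equally likely ordered draws. The sketch below is an illustration only; the deck is modeled by rank alone, and the names `deck`, `draws`, `n_a`, and `n_ab` are assumptions introduced here to mirror the notation of the proof.

```python
from fractions import Fraction
from itertools import permutations

# Model the deck by rank only: 4 aces ("A") and 48 other cards ("x").
deck = ["A"] * 4 + ["x"] * 48

# Ordered draws of two different cards (no replacement): 52 * 51 cases.
draws = list(permutations(range(52), 2))
n = len(draws)

# n_a: cases with an ace on the first draw; n_ab: cases with aces on both.
n_a = sum(1 for i, j in draws if deck[i] == "A")
n_ab = sum(1 for i, j in draws if deck[i] == "A" and deck[j] == "A")

p_A = Fraction(n_a, n)             # probability of ace on the first draw
p_B_given_A = Fraction(n_ab, n_a)  # probability of ace on the second, given A
p_AB = Fraction(n_ab, n)           # probability of two aces

# With replacement the two draws do not influence each other.
p_with_replacement = Fraction(4, 52) * Fraction(4, 52)
```

The quotient `n_ab / n_a` plays exactly the role it has in the proof: it is the proportion of the cases favorable to A that are also favorable to B.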
For another illustration of the theorem (3-3), let us find the probability
of drawing a white and a black ball in succession from an urn containing
30 black balls and 20 white balls. Here the probability of drawing a white
ball is 20/50. After a white ball is drawn, the probability of drawing a
black ball is 30/49. Hence the probability of drawing a white ball and a
black ball in the order stated is
$$p = \frac{20}{50} \cdot \frac{30}{49} = \frac{12}{49}.$$
The events A and B are said to be independent if the information that
A happened does not influence the probability of B. Hence for such
events $p_A(B) = p(B)$, and the theorem of compound probability takes the
form
$$p(AB) = p(A)\,p(B), \quad \text{for independent events.} \tag{3-4}$$
For instance, let a coin and a die be tossed, and let A be the event “head
shows” while B is the event “4 shows.” These events are independent,
and hence the probability that heads and 4 both appear is
$$p(AB) = p(A)\,p(B) = \left(\tfrac{1}{2}\right)\left(\tfrac{1}{6}\right) = \tfrac{1}{12}.$$
The result (3-4) is readily extended to any number of independent events.
Besides the theorem of compound probability, there is a second fundamental
relationship, known as the theorem of total probability. If A and B are two
events, A + B is defined to be the event “A or B or both.” For instance, let
A be the event “a number greater than 3 shows” while B is the event “an
even number shows” in a toss of a die. Then A + B happens if the die gives
2, 4, 5, or 6. In this notation the theorem of total probability reads as follows:
Theorem. When A and B are any events, then
$$p(A + B) = p(A) + p(B) - p(AB). \tag{3-5}$$
We can represent the statement (3-5) diagrammatically by the intersecting
point sets A and B shown in Fig. 2.
Referring to the definition of probability by equally likely cases, suppose
the numbers of cases favorable to A, B, AB, and A + B are denoted by
$n_a, \; n_b, \; n_{ab}, \; n_{a+b},$
respectively. To find the number favorable to A + B, it will not do
simply to add $n_a$ and $n_b$, for the cases favorable to both A and B are counted
twice in this addition. To take account of that we must subtract $n_{ab}$,
thus:
$$n_{a+b} = n_a + n_b - n_{ab}.$$
Dividing by n, the total number of cases, gives
$$\frac{n_{a+b}}{n} = \frac{n_a}{n} + \frac{n_b}{n} - \frac{n_{ab}}{n},$$
which is equivalent to (3-5).
To illustrate the theorem, let us find the probability that at least one
die gives 4, when two dice are tossed. The probability that both give 4
is 1/36. The probability that the first gives 4 is 1/6, and similarly for the
second. Hence the probability that at least one gives 4 is
$$p(A + B) = \tfrac{1}{6} + \tfrac{1}{6} - \tfrac{1}{36} = \tfrac{11}{36}. \tag{3-6}$$
This is consistent with the result given by counting cases. Specifically, there are 5
cases with a 4 on the first and a number other than 4 on the second, there are 5 cases
with 4 on the second and a number other than 4 on the first, and there is 1 case with 4
on both. The number of favorable cases is therefore 5 + 5 + 1 = 11, so that (3-6)
follows.
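The same count can be carried out by enumerating the 36 equally likely cases for two dice. The following is a minimal sketch, with the helper `p` and the event functions `A` and `B` introduced here purely for illustration.

```python
from fractions import Fraction
from itertools import product

# Uniform sample space for two dice: 36 equally likely ordered pairs.
rolls = list(product(range(1, 7), repeat=2))

def p(event):
    """Probability of an event by counting favorable cases."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def A(r):
    return r[0] == 4      # the first die gives 4

def B(r):
    return r[1] == 4      # the second die gives 4

p_A = p(A)
p_B = p(B)
p_AB = p(lambda r: A(r) and B(r))       # both give 4
p_A_plus_B = p(lambda r: A(r) or B(r))  # at least one gives 4
```

Counting "A or B" directly and applying the right-hand side of the theorem give the same fraction, because the single case (4, 4) is subtracted once after being counted twice.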
For mutually exclusive events, that is, for events A, B which cannot
both happen, p(AB) = 0. Hence the theorem of total probability takes
the form
$$p(A + B) = p(A) + p(B), \quad \text{for mutually exclusive events.} \tag{3-7}$$
The statement (3-7) can be depicted by the nonintersecting point sets in
Fig. 3.
For example, in a toss of a die let A be the event “4 shows” while B
is the event “5 shows.” Since these events are mutually exclusive, the
probability of getting either 4 or 5 is
$$p(A + B) = p(A) + p(B) = \tfrac{1}{6} + \tfrac{1}{6} = \tfrac{1}{3}.$$
A result similar to (3-7) applies to any number of mutually exclusive
events A, B, C, . . . .
The foregoing analysis, by counting cases, establishes the theorems of
total and compound probability for uniform sample spaces only. Actually
the results are valid for arbitrary sample spaces, as will be indicated next.
Assuming that the sample space is finite, let the events $E_i$ of the sample
space be so numbered that
$E_1, \ldots, E_j$
are favorable to A alone,
$E_{j+1}, \ldots, E_k$
are favorable to both A and B, while
$E_{k+1}, \ldots, E_m$
are favorable to B alone. If the associated probabilities are $p_i$, then (3-5)
is equivalent to the identity
$$p_1 + \cdots + p_m = (p_1 + \cdots + p_j + p_{j+1} + \cdots + p_k) + (p_{j+1} + \cdots + p_k + p_{k+1} + \cdots + p_m) - (p_{j+1} + \cdots + p_k).$$
The three parentheses on the right represent, respectively, p(A), p(B),
and p(AB) by definition.
To derive (3-3) for a general sample space, recall that the sample points
favorable to B have the same relative weights after A happened as before.
Hence, in the previous notation,
$$p(AB) = p_{j+1} + \cdots + p_k = (p_1 + \cdots + p_k)\left(\frac{p_{j+1}}{p_1 + \cdots + p_k} + \cdots + \frac{p_k}{p_1 + \cdots + p_k}\right) = p(A)\,p_A(B).$$
Example 1. The probability that Peter will solve a problem is $p_1$, and the probability
that Paul will solve it is $p_2$. What is the probability that the problem will be solved if
Peter and Paul work independently?
The probability that both solve it is $p_1p_2$, by the theorem of compound probability,
(3-3). Hence the probability that at least one solves it is
$$p_1 + p_2 - p_1p_2 \tag{3-8}$$
by the theorem of total probability, (3-5).
Example 2. Solve Example 1 by finding the probability that both fail.
Peter’s probability to fail is $1 - p_1$, and Paul’s probability to fail is $1 - p_2$. The
probability that both fail is
$$(1 - p_1)(1 - p_2),$$
and the probability of the contrary event, that at least one succeeds, is
$$1 - (1 - p_1)(1 - p_2). \tag{3-9}$$
The consistency of (3-8) and (3-9) is easily verified.
Example 3. A bag contains 10 white balls and 15 black balls. Two balls are drawn
in succession. What is the probability that one of them is black and the other is white?
The mutually exclusive events in this problem are (a) drawing a white ball on the
first trial and a black ball on the second, (b) drawing a black ball on the first trial and
a white on the second. The probability of (a) is (10/25)·(15/24) and that of (b) is
(15/25)·(10/24), so that the probability of either (a) or (b) is
$$\frac{10}{25} \cdot \frac{15}{24} + \frac{15}{25} \cdot \frac{10}{24} = \frac{1}{2}.$$
Example 4. How often must a pair of dice be tossed to make it more likely than not
that double 6 appears at least once?
The probability that double 6 does not appear on a given toss is 35/36, no matter
what is known about the preceding tosses. Repeated use of the theorem of compound
probability gives
$$\left(\tfrac{35}{36}\right)^n$$
for the probability that double 6 does not appear in any of n tosses. It is desired to
choose n in such a way that this probability is less than 1/2. Thus,
$$\left(\tfrac{35}{36}\right)^n < \tfrac{1}{2}.$$
Taking the logarithm gives
$$n \log \tfrac{35}{36} < -\log 2$$
or
$$n > \frac{\log 2}{\log \tfrac{36}{35}} = 24.6.$$
Thus 25 tosses suffice, but 24 do not.
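The threshold in Example 4 is easily confirmed numerically. In this sketch (not part of the original text) the variable names are illustrative; exact fractions are used for the comparison with 1/2, and floating-point logarithms only for the bound 24.6.

```python
import math
from fractions import Fraction

# Probability that double 6 does not appear on a single toss of two dice.
q = Fraction(35, 36)

# Bound obtained by taking logarithms: n > log 2 / log(36/35).
threshold = math.log(2) / math.log(36 / 35)

# Smallest n for which "no double 6 in n tosses" has probability below 1/2.
n = 1
while q ** n >= Fraction(1, 2):
    n += 1
```

The loop stops at the first integer above `threshold`, in agreement with the statement that 25 tosses suffice but 24 do not.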
Example 5. Peter and Paul take turns tossing a pair of dice. The first to get a throw
of 7 wins. If Peter starts the game, how much better are his chances of winning than
Paul’s?
This problem is different from any we have considered hitherto, in that there are
infinitely many possibilities. Namely, Peter may win on his first throw, or on his second
throw, or on his third throw, and so forth. To apply the preceding theory, we simply
consider the probability that Peter wins in n throws and take the limit as n → ∞.
A wide variety of questions involving infinitely many outcomes may be dealt with in a
similar manner.
The probability of 7 is 1/6, and the probability of not getting 7 is 5/6. Hence the
probability that Peter wins on his first throw is 1/6. The probability that Peter wins
on his second throw is $(5/6)^2(1/6)$ (since Peter’s first throw and Paul’s first throw must be
other than 7 but Peter’s second throw must be 7). Peter’s probability of winning on his
third throw is $(5/6)^4(1/6)$, and so on.
By the theorem of total probability the probability that Peter wins is
$$\tfrac{1}{6} + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^2 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^4 + \cdots = \tfrac{1}{6}(1 + r + r^2 + \cdots), \quad \text{where } r = \left(\tfrac{5}{6}\right)^2,$$
$$= \frac{1}{6} \cdot \frac{1}{1 - r} = \frac{1}{6} \cdot \frac{1}{1 - \tfrac{25}{36}} = \frac{6}{11}. \tag{3-10}$$
A similar procedure shows that Paul’s chance of winning is 5/11, or one can reason as
follows: The probability that 7 does not occur in n trials is $(5/6)^n$. Since the limit is
zero, the probability of an eternal game is zero, and Peter or Paul is sure to win. Thus,
Paul’s chance is
$$1 - \tfrac{6}{11} = \tfrac{5}{11}.$$
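The limiting process in Example 5 can be made concrete by computing the probability that Peter wins within his first n throws and comparing it with the sum of the geometric series. This is a sketch only; the names `p7`, `r`, and `peter_wins_within` are assumptions introduced for the illustration.

```python
from fractions import Fraction

p7 = Fraction(1, 6)      # probability of throwing 7 with two dice
r = (1 - p7) ** 2        # both players miss once: r = (5/6)**2

def peter_wins_within(n):
    """Probability that Peter wins on one of his first n throws."""
    return sum(p7 * r ** k for k in range(n))

# Sum of the geometric series (1/6)(1 + r + r**2 + ...).
peter = p7 / (1 - r)
paul = 1 - peter
```

The partial sums `peter_wins_within(n)` increase toward `peter` as n grows, and the gap after n throws is proportional to r raised to the power n, so it shrinks geometrically.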
PROBLEMS
1. What is the probability that 5 cards dealt from a pack of 52 cards are all of the
same suit?
2. Five coins are tossed simultaneously. What is the probability that at least one of
them shows a head? All show heads?
3. The probability that Paul will be alive 10 years hence is % and that John will be
alive is What is the probability that both Paul and John will be dead 10 years hence?
Paul alive and John dead? John alive and Paul dead?
4. One purse contains 3 silver and 7 gold coins; another purse contains 4 silver and 8
gold coins. A purse is chosen at random, and a coin is drawn from it. What is the
probability that it is a gold coin?
5. Paul and Peter are alternately throwing a pair of dice. The first man to throw a
doublet is to win. If Paul throws first, what is his chance of winning on his first throw?
What is the probability that Paul fails and Peter wins on his first throw?
6. How many times must a die be thrown in order that the probability that the ace
appear at least once shall be greater than Hs?
7. Twenty tickets are numbered from 1 to 20, and one of them is drawn at random.
What is the probability that the number is a multiple of 5 or 7? A multiple of 3 or 5?
Note that in solving the second part of this problem, it is incorrect to reason as follows:
The number of tickets bearing numerals that are multiples of 3 is 6, and the number of
multiples of 5 is 4. Hence the probability that the number drawn is either a multiple
of 3 or of 5 is 6/20 + 4/20 = 1/2. Why is this reasoning incorrect?
8. A card is chosen at random from each of 5 decks. What is the probability that all
are face cards? Would the probability be larger or smaller if all 5 cards were taken from
one deck, without replacing?
9. Answer the two questions in Prob. 8 if the desired hand is 1, 2, 3, 4, 5 of clubs; if
the desired hand is to have at least 2 aces but is otherwise unrestricted.
10. Each of two radio tubes has probability p of burning out during the first 100 hr
use. If both are put into service at the same time, what is the probability that at least
one of them is still good after 100 hr? Generalize to n tubes. If p = 0.1, how many
tubes are needed to give a probability > 0.99 that at least one is good after 100 hr?
4. Random Variables and Expectation.¹ A process is random if it is
impossible to predict the final state from the initial state (as, for example,
in a toss of a coin or a die). Associated with a random process there may
be certain numerically valued variables which themselves have a random
character. For instance, if X denotes the number obtained by tossing a
die, then X is a variable which assumes the values
1, 2, 3, 4, 5, 6
corresponding to the six events: 1 shows, 2 shows, and so forth. The
respective probabilities are
1/6, 1/6, 1/6, 1/6, 1/6, 1/6.
Again if X is the number of heads obtained when 3 coins are tossed, then
X is a variable which assumes the values
0, 1, 1, 1, 2, 2, 2, 3 (4-1)
corresponding to the various ways the coins may fall. For instance,
X = 2 corresponds to each of the three events: HHT, HTH, THH.
¹ Sections 4 through 6 may be omitted on the first reading without loss of continuity,
but they are essential to the developments in Sec. 13.
Similarly, if a gambler stakes d dollars on a game, the amount X he wins
assumes the values
d, −d
in correspondence with the events “he wins the game” and “he loses the
game.” If his probability of winning is p, the respective probabilities of
X = d and X = −d are
p, 1 − p.
These special cases illustrate the important idea of random variable. A
random variable is a numerical-valued function defined on a sample space.
In symbols,
$$X(e_i) = x_i, \quad i = 1, 2, \ldots, n, \tag{4-2}$$
where $e_i$ are the events of the sample space and $x_i$ are the values of the
random variable X.
Let $\{e_i\}$ be a sample space of n events $e_i$ with associated probabilities
$p_i$. Let X be a random variable defined on $\{e_i\}$ and assuming the value
$x_i$ at the ith sample point, so that (4-2) holds. The expectation or expected
value E(X) is then defined by
$$E(X) = p_1x_1 + p_2x_2 + \cdots + p_nx_n. \tag{4-3}$$
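For example, if X is the number obtained in a toss of a die, then X
assumes the values 1, 2, 3, . . . with corresponding probabilities $p_i = 1/6$.
Hence
$$E(X) = \tfrac{1}{6} \cdot 1 + \tfrac{1}{6} \cdot 2 + \tfrac{1}{6} \cdot 3 + \tfrac{1}{6} \cdot 4 + \tfrac{1}{6} \cdot 5 + \tfrac{1}{6} \cdot 6 = \tfrac{7}{2}.$$
Similarly, if X is the number of heads obtained when 3 coins are tossed,
then (4-1) and (4-3) give
$$E(X) = \tfrac{0}{8} + \tfrac{1}{8} + \tfrac{1}{8} + \tfrac{1}{8} + \tfrac{2}{8} + \tfrac{2}{8} + \tfrac{2}{8} + \tfrac{3}{8} = \tfrac{3}{2}$$
when we note that $p_i = 1/8$ in this case.
By grouping terms we can write the above sum in the form
$$E(X) = \tfrac{1}{8} \cdot 0 + \tfrac{3}{8} \cdot 1 + \tfrac{3}{8} \cdot 2 + \tfrac{1}{8} \cdot 3.$$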
The factors
0, 1, 2, 3
represent the numerically distinct values of X, and the factors
1/8, 3/8, 3/8, 1/8
represent the probabilities corresponding to these distinct values. For
example, 3/8 is the probability of 2 heads when 3 coins are tossed, and
hence 3/8 is the probability that X = 2. A similar grouping of terms can
be applied to the general definition (4-3) and yields the following useful
theorem:
Theorem I. The expectation E(X) is given by
$$E(X) = P_1x_1 + P_2x_2 + \cdots + P_rx_r,$$
where $x_1, x_2, \ldots, x_r$ are the numerically distinct values of X and where $P_i$
is the probability that $X = x_i$.
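Theorem I can be checked on the three-coin example by computing the expectation both ways. The sketch below is illustrative only; `points`, `E_by_points`, `P`, and `E_grouped` are names assumed here, with `P` holding the probabilities of the distinct values.

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Sample space: the 8 equally likely outcomes of three coins, each with p_i = 1/8.
points = list(product("HT", repeat=3))
p_i = Fraction(1, len(points))

# Definition (4-3): sum over all sample points of p_i * x_i,
# where x_i is the number of heads at that point.
E_by_points = sum(p_i * pt.count("H") for pt in points)

# Theorem I: group the points by the distinct values of X.
counts = Counter(pt.count("H") for pt in points)
P = {x: Fraction(c, len(points)) for x, c in counts.items()}
E_grouped = sum(P[x] * x for x in P)
```

The grouped sum has 4 terms instead of 8, but each grouped probability is just the total weight of the sample points sharing that value, so the two sums agree.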
Let $x_i$ be the r distinct values of a random variable X, and let $y_j$ be the
s distinct values of another random variable Y. The sum X + Y is a
random variable which is defined to be $x_i + y_j$ when $X = x_i$ and $Y = y_j$.
Thus, X + Y is defined on a sample space whose points consist of the rs
events
$$X = x_i \text{ and } Y = y_j \tag{4-4}$$
for i = 1, 2, . . . , r and j = 1, 2, . . . , s. One of the most important
theorems in probability theory concerns sums of variables and reads as follows:
Theorem II. The expectation of the sum of two random variables is equal
to the sum of the expectations, or in symbols,
$$E(X + Y) = E(X) + E(Y). \tag{4-5}$$
To prove Eq. (4-5) let $p_{ij}$ be the probability that simultaneously $X = x_i$
and $Y = y_j$. Thus, $p_{ij}$ is the probability of the event (4-4). The definition
of expectation yields
$$E(X + Y) = \sum_i \sum_j p_{ij}(x_i + y_j), \tag{4-6}$$
since $x_i + y_j$ is the value of X + Y which corresponds to the event (4-4).
By rearrangement,
$$E(X + Y) = \sum_i x_i \Big(\sum_j p_{ij}\Big) + \sum_j y_j \Big(\sum_i p_{ij}\Big). \tag{4-7}$$
Now, $\sum_j p_{ij}$ represents the probability of
$(X = x_i, Y = y_1)$ or $(X = x_i, Y = y_2)$ . . . or $(X = x_i, Y = y_s)$.
Hence, it represents¹ the probability $P_i$ that $X = x_i$. Theorem I now gives
$$\sum_i x_i \Big(\sum_j p_{ij}\Big) = \sum_i x_i P_i = E(X)$$
and similarly,
$$\sum_j y_j \Big(\sum_i p_{ij}\Big) = E(Y).$$
Thus, (4-7) is equivalent to (4-5). The extension to any number of
variables is immediate.
¹ This shows that $\sum_i \sum_j p_{ij} = \sum_i P_i = 1$, hence that the events (4-4) actually do form a
sample space.
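The proof does not assume that X and Y are independent, and a small computation makes that point tangible. The following sketch is an illustration, not part of the original text: it takes X to be the number on the first of two dice and Y the larger of the two numbers (a choice made here only because the two variables are clearly dependent), builds the joint probabilities, and compares the two sides of Theorem II. The dictionary names `joint`, `P`, and `Q` are assumptions.

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Joint probabilities p_ij of the events "X = x_i and Y = y_j",
# with X = first die and Y = larger of the two dice.
joint = defaultdict(Fraction)
for a, b in product(range(1, 7), repeat=2):
    joint[(a, max(a, b))] += Fraction(1, 36)

# Left side: E(X + Y) as the double sum of p_ij * (x_i + y_j).
E_sum = sum(p * (x + y) for (x, y), p in joint.items())

# Probabilities P_i of X = x_i and Q_j of Y = y_j, obtained by summing p_ij.
P = defaultdict(Fraction)
Q = defaultdict(Fraction)
for (x, y), p in joint.items():
    P[x] += p
    Q[y] += p

# Right side: E(X) + E(Y), each computed as in Theorem I.
E_X = sum(x * p for x, p in P.items())
E_Y = sum(y * p for y, p in Q.items())
```

Summing the joint probabilities over one index is the rearrangement step of the proof; nothing in it requires the joint probability to factor into a product.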
The following alternative approach to Theorem II does not require the use of
Theorem I. Let X be defined on a sample space $\{a_i\}$ containing n events and Y on a space
$\{b_j\}$ containing m events. Thus,
$$X(a_i) = x_i \quad \text{and} \quad Y(b_j) = y_j.$$
The variable X + Y is defined on a sample space whose mn events $e_{ij}$ happen when, and
only when, $a_i$ and $b_j$ both happen. The value of X + Y corresponding to the event $e_{ij}$
is defined to be $x_i + y_j$. If $p_{ij}$ is the probability of $e_{ij}$, then the definition of
expectation gives (4-6), which may be written in the form (4-7) as before. Since the events of
the sample space $\{b_j\}$ are mutually exclusive (Sec. 2), the sum $\sum_j p_{ij}$ represents the
probability of
$a_i$ and $b_1$ or $a_i$ and $b_2$ . . . or $a_i$ and $b_m$.
Hence it represents $p_i$, the probability of $a_i$. The first term in (4-7) is therefore E(X)
by (4-3), and similarly, the second term is E(Y).
The sums
$$\sum_j p_{ij} \quad \text{and} \quad \sum_i p_{ij} \tag{4-8}$$
are called the marginal probabilities of $a_i$ and $b_j$, respectively. In modern statistical
theory it is customary to start with the larger sample space $\{e_{ij}\}$ and to define the
probabilities on the smaller spaces $\{a_i\}$ and $\{b_j\}$ by means of (4-8). Theorem II is then
valid, so to say, by fiat.
Since $\sum p_i = 1$, the expectation E(X) in (4-3) may be interpreted as the
center of mass:
$$E(X) = \frac{p_1x_1 + p_2x_2 + \cdots + p_nx_n}{p_1 + p_2 + \cdots + p_n}.$$
For equally likely $x_i$ the result reduces to the arithmetic mean
$$E(X) = \frac{1}{n}(x_1 + x_2 + \cdots + x_n), \quad \text{if each } p_i = \frac{1}{n}.$$
Thus, E(X) is a measure of the location of X; it is a typical value. The
following sections show that if sufficiently many observations of the
variable X are made, the mean of those observations will almost certainly
be close to E(X). In this sense, E(X) represents the average value
attained by X in the long run.
Throughout this section random variables were denoted by capital letters to avoid
confusion between the variable and its values $x_i$ or $y_j$. In statistical literature the
variables are usually denoted by small letters. Since the distinction has now been sufficiently
emphasized, we shall often use small letters in the remainder of this chapter. Thus,
depending on the context, $x_i$ may be a set of random variables or the values of a single
variable.
Example 1. Find the expected number of heads when n coins are tossed.
Let $X_i = 1$ if the ith coin shows heads and $X_i = 0$ otherwise. Then, for each i,
$$E(X_i) = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2} \cdot 0 = \tfrac{1}{2}.$$
(The reader is cautioned that $X_1, X_2, \ldots$ are distinct variables here, not the different
values $x_i$ of a single variable.) The number of heads m is
$$m = X_1 + X_2 + \cdots + X_n,$$
and hence
$$E(m) = E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n) = \frac{n}{2}.$$
Example 2. From an urn containing a white and b black balls, a ball is drawn at
random and set aside. What is the expected number of white balls left in the urn?
Let X be the number of white balls left. If a white ball is drawn, then X = a − 1,
whereas if a black ball is drawn, then X = a. Hence
$$E(X) = \frac{a}{a + b}(a - 1) + \frac{b}{a + b}\,a = a - \frac{a}{a + b}.$$
Example 3. A deck of cards is thoroughly shuffled. We say there is a coincidence if
a card has the same position after shuffling as it had before (e.g., if it is the fourth from
the top both times). Find the expected number of coincidences.
Let $X_i = 1$ if the ith card is in the same position before and after shuffling, and let
$X_i = 0$ otherwise. Then
$$E(X_i) = \tfrac{1}{52} \cdot 1 + \tfrac{51}{52} \cdot 0 = \tfrac{1}{52}.$$
Since the number of coincidences is $\sum X_i$, its expectation is
$$E(X_1) + E(X_2) + \cdots + E(X_{52}) = 1.$$
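The argument of Example 3 does not depend on the size of the deck, so it can be tested by brute force on small decks, where all orderings can be listed. The sketch below is an illustration under that assumption; the function name `expected_coincidences` is hypothetical, and a 52-card deck is far too large to enumerate this way.

```python
from fractions import Fraction
from itertools import permutations

def expected_coincidences(n):
    """Expected number of cards left in place, averaged over all n! shuffles."""
    perms = list(permutations(range(n)))
    total = sum(
        sum(1 for position, card in enumerate(perm) if position == card)
        for perm in perms
    )
    return Fraction(total, len(perms))

# Decks of 1 to 6 cards: at most 720 orderings each.
small_decks = {n: expected_coincidences(n) for n in range(1, 7)}
```

For a deck of n cards each indicator has expectation 1/n, and there are n of them, which is why the result is 1 for every n.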
PROBLEMS
1. A bent coin has probability p of giving heads and probability q = 1 − p of giving
tails. Let X be a random variable representing the number of heads when the coin is
tossed three times; X is defined on a sample space consisting of the 8 events HHH,
HHT, . . . with associated probabilities $p^3$, $p^2q$, . . . . (a) Make a table giving the 8 values
of X associated with the 8 sample points and their respective probabilities. (b) Make
a second table giving the 4 distinct values of X and their probabilities. (c) Compute
the expectation E(X) from your table (a) and also from your table (b).
2. If X is the number of heads and Y the number of tails, find E(XY) from your
table in Prob. 1a and also from that in 1b. Is it true that E(XY) = E(X)E(Y)? Hint:
Make a table giving the 4 values of XY in the 4 cases of Prob. 1b.
3. Peter turns up the cards one at a time from a 52-card deck, and Paul tries to guess
what the cards are. Find the expected number of correct guesses (a) when Paul calls
out at random, perhaps repeating himself, (b) when Paul calls off the 52 cards, naming
each one just once, (c) when Paul calls out “ace of spades” each time. (Assume that
Paul has no actual insight into the behavior of the cards.)
4. In Prob. 3, suppose Peter tells Paul what the card was immediately after Paul
guesses. Paul has the good sense not to call any of those cards, since he knows they
have been set aside. What is the expected number of correct guesses now? Hint: Let
$X_i = 1$ if the ith guess is correct, $X_i = 0$ otherwise. $E(X_i) = \,?$ The expected number
of correct guesses will be found to be approximately $\log_e 52$.
5. A coin is tossed repeatedly. What is the expected number of the toss at which
heads first appear? Hint: Let X be the number of the toss at which heads first appear.
Then X has the values 1, 2, 3, . . . with respective probabilities 1/2, 1/4, 1/8, . . . . The
reader is reminded that $\sum n r^n = r/(1 - r)^2$ for any r such that |r| < 1.
5. Discrete Distributions. When the values $x_i$ of a random variable are
distinct, the associated probabilities $p_i$ may be written in the form
$$p_i = f(x_i).$$
Since the $x_i$ are supposed to be all the possible values of x, we must have
$$\sum f(x_i) = 1, \tag{5-1}$$
just as in the last section $\sum p_i = 1$. Also f(x) ≥ 0, because f(x) is a
probability.
For example, let x be the number of heads obtained when 4 coins are
tossed. If the value x = 0, 1, 2, 3, or 4 is given, then the probability to
assume that value is determined by the table
x =       0      1      2      3      4
f(x) =    1/16   4/16   6/16   4/16   1/16
The function f(x) is called the frequency function for reasons which will
now be explained. Suppose n observations of the variable x are made;
how often should we expect $x = x_i$? To answer this question, let $X_k = 1$
if $x = x_i$ at the kth observation and $X_k = 0$ otherwise. The number of
times $x = x_i$ is
$$m = X_1 + X_2 + \cdots + X_n.$$
Since the definition of expectation gives
$$E(X_k) = 1 \cdot f(x_i) + 0 \cdot [1 - f(x_i)] = f(x_i),$$
we have the fundamental result
$$E(m) = n f(x_i). \tag{5-2}$$
Thus, the frequency function $f(x_i)$ is proportional to the expected frequency
of the event $x = x_i$ in a fixed number of observations.
Since the values $x_i$ are distinct, the events $x = x_1$ and $x = x_2$ are
mutually exclusive. Hence, by total probability, the probability of
$x = x_1$ or $x = x_2$ is
$$f(x_1) + f(x_2).$$
In just the same way the probability of $x = x_1$, or $x_2$, . . . , or $x_k$ is
$$\sum_{i=1}^{k} f(x_i). \tag{5-3}$$
It is often desirable to consider the probability that x will not exceed
a given value. If $x_1, x_2, \ldots, x_k$ are the values of $x_i$ which do not exceed
t, then the probability that x ≤ t is given by the sum (5-3). That is,
the event “x ≤ t” is equivalent to the event “$x = x_1$, or $x = x_2$, . . . , or
$x = x_k$.” It is customary to write
$$F(t) = \sum_{x_i \le t} f(x_i) \tag{5-4}$$
for summation over the values of $x_i$ which do not exceed t. The function
F(t) thus obtained is called the distribution function; it gives the probability
that x ≤ t. When t is so small that no $x_i$ satisfies $x_i \le t$, the sum (5-4)
has no terms, and F(t) = 0 for such t. When t is so large that every $x_i$
satisfies $x_i \le t$, then the sum (5-4) includes every $x_i$. In this case (5-4)
gives the value F(t) = 1.
For example, if x is the number of heads obtained when 4 coins are tossed,
the distribution function is described by the following table:

t       t < 0    0 ≤ t < 1    1 ≤ t < 2    2 ≤ t < 3    3 ≤ t < 4    4 ≤ t
F(t)    0        1/16         5/16         11/16        15/16        1

These entries are obtained by adding the values of f(x) which were found
previously. For instance, 11/16 corresponds to the interval 2 ≤ t < 3
because
$$\sum_{x_i \le t} f(x_i) = f(0) + f(1) + f(2) = \tfrac{1}{16} + \tfrac{4}{16} + \tfrac{6}{16} = \tfrac{11}{16}.$$
The value 11/16 is the probability of getting at most 2 heads when 4 coins
are tossed.
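
As a hedged illustration of (5-1) and (5-4), the following Python sketch rebuilds the two tables for 4 coins with exact fractions. The function names `frequency` and `distribution` are illustrative choices and do not come from the text; the coins are assumed fair.

```python
from fractions import Fraction
from math import comb


def frequency(n_coins):
    """f(x): probability of exactly x heads among n_coins fair coins."""
    return {x: Fraction(comb(n_coins, x), 2 ** n_coins)
            for x in range(n_coins + 1)}


def distribution(f, t):
    """F(t): sum of f(x_i) over the values x_i that do not exceed t, as in (5-4)."""
    return sum((p for x, p in f.items() if x <= t), Fraction(0))


f = frequency(4)
# Fractions are printed in lowest terms, so 4/16 appears as 1/4.
print({x: str(p) for x, p in f.items()})
print(distribution(f, 2))  # probability of at most 2 heads
```

Exact fractions are used instead of floating-point numbers so that the check of (5-1), namely that the values of f(x) add up to 1, holds without rounding.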
The variables x considered so far in this chapter are called discrete
variables because they assume isolated values only. For instance, the
number of heads obtained when several coins are tossed is an integer 0,
1, 2, 3, ... (and cannot fill up an interval). The distribution of such a
variable is called a discrete distribution; it is defined for all values of x,
not only for the discrete set of possible values $x_k$. One may also think of
the frequency function as being defined for all x, taking f(x) = 0 for
values x other than the $x_k$. (For example, the probability of getting 3.2
heads is zero.) The fact is that we may define f(x) in any arbitrary fashion
for values other than the $x_k$, provided some care is taken in the interpretation
of the results. This possibility is exploited in the following discussion.
Graphical representation of the functions f(x) and F(x) is given in
Figs. 4 to 6. Figure 4 is valid as a probability for all x. The relationship
of f(x) and F(x) is clarified, however, if f(x) is modified as shown in
Fig. 5. Here, the value of f(x) at any integer m is used for f(x) in the
interval of length 1 centered about m. The resulting step function still
gives the probability that $x = x_k$, provided $x_k$ is an integer. The advantage
of redefining f(x) in this fashion rests upon the following
property, which is easily verified: If t is an integer, then F(t) is the area
under the curve of Fig. 5 up to the value x = t + 1/2. For instance, the
area up to the value x = 2 1/2 is found to be
$$f(0) + f(1) + f(2) = F(2)$$
by adding the areas of the shaded rectangles. When the values $x_k$ are
equally spaced, similar considerations apply to any distribution and
frequency functions F and f. For unit spacing, that is, for $x_{k+1} - x_k = 1$,
Eq. (5-1) expresses the fact that the area under the curve is 1.
Fig. 6
Actually, it is possible to describe the relationship of f and F directly, without
introduction of the intermediate curve (Fig. 5). The description involves the so-called Stieltjes
integral, which is now to be defined. Let F(t) be a nondecreasing function on an interval
a ≤ t ≤ b, and let φ(t) be continuous. Choose a set of points $t_0, t_1, \ldots, t_n$ on the interval
and choose intermediate values
$$t_k \le \xi_k \le t_{k+1}.$$
As the subdivision given by the $t_k$'s is made finer and finer, in such a way that
$$\max |t_{k+1} - t_k| \to 0,$$
it can be shown that the expression
$$\sum_k \phi(\xi_k)\,[F(t_{k+1}) - F(t_k)]$$
tends to a limit (independent of the manner of subdivision and of the points $\xi_k$). The
limit is called the Stieltjes integral of φ with respect to F and is written
$$\int_a^b \phi(t)\, dF(t).$$
When F(t) is a discrete distribution corresponding to $x_k$ and f(x), the function F(t)
has a jump of value $f(x_k)$ at each value $x_k$ but is constant between those values. Hence
the differences
$$F(t_{k+1}) - F(t_k)$$
behave much like the function exemplified in Fig. 4. They assume the value $f(x_k)$ if the
interval $(t_k, t_{k+1})$ contains a single point $x_k$, and they assume the value 0 if the interval
contains no point $x_k$. The relationship of f and F is now described by the equation
$$F(t) = \int_{-\infty}^{t} dF(x)$$
where the integral is a Stieltjes integral. Although we have not defined a differential
dF, we may think of dF(x) as being equivalent to the frequency function f(x) in the
sense described above.
Example 1. In terms of the distribution function, express the probability that
a < x ≤ b, where a and b are two numbers with a < b.
The event "x ≤ b" can materialize in the mutually exclusive forms
x ≤ a or a < x ≤ b.
Hence, by total probability,
Pr (x ≤ b) = Pr (x ≤ a) + Pr (a < x ≤ b)
where Pr means "the probability that." This yields the desired expression
Pr (a < x ≤ b) = F(b) − F(a)    (5-5)
when we recall that the distribution function F(t) satisfies
Pr (x ≤ b) = F(b), Pr (x ≤ a) = F(a).
Example 2. In terms of the frequency function f(x) the expectation of any variable
y = g(x) is
$$E(y) = \sum_k y_k f(x_k), \qquad y_k = g(x_k). \tag{5-6}$$
We consider the variable y to be defined on a sample space whose points are the n
events
$$x = x_1,\quad x = x_2,\quad \ldots,\quad x = x_n.$$
The probability of the event "$x = x_k$" is $f(x_k)$; the value of y corresponding to the event
"$x = x_k$" is $y_k = g(x_k)$. Hence, (5-6) follows from the general definition of expectation.
PROBLEMS
1. Suppose a coin is tossed 5 times. What is the probability that this experiment
will yield 0, 1, 2, 3, 4, 5 heads?
2. If x is the number of heads in Prob. 1, make a table representing the frequency
function f(x). Plot f(x) and also the step-function modification (see Figs. 4 and 5).
3. In Prob. 1 make a table and also a graph for the distribution function F(t).
4. If a coin is tossed 5 times, find the probability that the number of heads x satisfies
1 < x < 4 by use of (a) the frequency function f(x) computed in Prob. 2, (b) the
distribution function F(t) computed in Prob. 3, (c) the step-function graph obtained in
Prob. 2, with reference to an appropriate area under the curve.
6. Continuous Distributions. Since measurements are made only to a
certain number of significant figures, the variables which arise as the result
of an experiment are discrete. For example, if the diameter of a shaft is
measured to the nearest 0.01 in., the measurement is a variable which
assumes only isolated values, such as 3.21, 3.22, 3.23, ... in. Nevertheless
it is convenient to introduce continuous variables, because they are easier¹
to handle analytically. Such variables are now to be discussed.
Let a point be chosen at random on the interval 0 ≤ x ≤ 1. How shall
we measure the probabilities associated with that event? If the interval
(0,1) is divided into a number of subintervals, each of length Δx = 0.1, then
the point x is equally likely to be in any of these subintervals (Fig. 7).
The probability that² 0.5 < x < 0.8, for example, is 0.3, since there are
three favorable cases. The probability that 0.52 < x < 0.84 is found to
be 0.84 − 0.52 = 0.32 when we divide the interval into 100 parts, and so
on. This reasoning shows that the probability for x to be in a given subinterval
of (0,1) is the length of that subinterval. If Pr stands for "the
probability that," then
$$\Pr(a < x < b) = b - a, \qquad 0 \le a < b \le 1. \tag{6-1}$$
When (6-1) holds, the variable x is said to be uniformly distributed on
the interval 0 ≤ x ≤ 1. Since the expression (6-1) may be written
$$\Pr(a < x < b) = \int_a^b dx = \int_a^b 1\, dx, \tag{6-2}$$
it is customary to speak of the probability density, which in this case is
unity.
Fig. 7
¹ This remark does not justify the use of continuous variables in applied mathematics.
The justification rests upon the fact that discrete variables can be approximated by
continuous ones within the experimental error.
² In this section it will not matter whether the intervals include their end points or
not. Thus, Pr (a < x ≤ b) = Pr (a < x < b).
More generally, a variable may be distributed with an arbitrary density
f(x). For such a variable the expression
$$f(t)\,\Delta t$$
measures, approximately, the probability that x is on the interval
$$t < x < t + \Delta t.$$
An exact expression for the probability that x is on a given interval (a,b)
is¹
$$\Pr(a < x < b) = \int_a^b f(x)\,dx. \tag{6-3}$$
This relation is illustrated in Fig. 8.
As indicated above, the function f(x) is called the probability density;
the function
$$F(t) = \int_{-\infty}^{t} f(x)\,dx \tag{6-4}$$
is called the distribution function. Evidently, F(t) is the probability
that x is in the interval (−∞, t); in other words,
$$F(t) = \Pr(x \le t). \tag{6-5}$$
If f(x) is continuous, then (6-4) gives
$$F'(t) = f(t)$$
and one may speak of a probability differential
$$dF(t) = f(t)\,dt.$$
Fig. 8
¹ The symbol x in (6-3) is used in two different senses. On the left x is a random variable,
and on the right x is the variable of integration. The integral could have been
written $\int_a^b f(\xi)\,d\xi$, for example.
To find the distribution function F(t) associated with the uniform density f(x) on the
interval (0,1) we take
f(x) = 0, x < 0,
f(x) = 1, 0 ≤ x ≤ 1,
f(x) = 0, x > 1.
This expresses the fact that x is sure to be in the interval (0,1), and is uniformly distributed
on that interval. Hence, for 0 ≤ t ≤ 1,
$$F(t) = \int_{-\infty}^{t} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_0^t f(x)\,dx = 0 + \int_0^t 1\,dx = t. \tag{6-6}$$
In a similar manner one obtains
F(t) = 0, t < 0,
F(t) = 1, t > 1,
which expresses the fact that x is never < 0 but is always ≤ 1.
The following density functions arise in many applications.
Poisson :
e m* , 0 < x < <», n > 0, r — nonnegative integer,
r!
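Gauss:
$$\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty,\quad \sigma > 0,\quad -\infty < \mu < \infty,$$
Maxwell-Boltzmann:
$$4a\sqrt{\frac{a}{\pi}}\; x^2 e^{-a x^2}, \qquad 0 \le x < \infty,\quad a > 0.$$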
The random variable is x; the parameters r, a, σ are constants. For
example, in the Maxwell-Boltzmann distribution x is the magnitude of
the velocity of a gas molecule and a = m/2kT, where m is the mass, T
is the temperature, and k is called the Boltzmann constant. A graph of
the function for a = 1 is given in Fig. 8. The Poisson distribution is
discussed in Sec. 11; the Gauss distribution in Secs. 9, 10, and 12. The
latter is often called the normal distribution, but in this text the term
normal distribution is applied to the case σ = 1, μ = 0 only.
Densities and distribution functions are easily defined for several variables.
We say that f(x,y) is the probability density (or the joint probability
density) for (x,y) if the probability that (x,y) is in any given region
R of the xy plane is
$$\Pr[(x,y) \text{ in } R] = \iint_R f(x,y)\,dx\,dy. \tag{6-7}$$
The distribution function is
$$F(s,t) = \int_{-\infty}^{t}\!\int_{-\infty}^{s} f(x,y)\,dx\,dy = \Pr[x \le s \text{ and } y \le t]. \tag{6-8}$$
Since probabilities are nonnegative, the density functions in (6-3) and
(6-7) satisfy
$$f(x) \ge 0, \qquad f(x,y) \ge 0. \tag{6-9}$$
Since the variables always have some finite value, in (6-3)
$$\int_{-\infty}^{\infty} f(x)\,dx = 1, \tag{6-10}$$
and in (6-7)
$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x,y)\,dx\,dy = 1. \tag{6-11}$$
Any integrable function f(x) or f(x,y) which satisfies these conditions
(6-9) to (6-11) may be regarded as a probability density. The sample
space is infinite; it consists of the events $x = x_0$ for every choice of $x_0$
or $(x,y) = (x_0,y_0)$ for every choice of $(x_0,y_0)$.
For example, if f(x,y) = 1/A in a region R of area A and f(x,y) = 0 elsewhere, it is
easily verified that (6-11) holds. The probability that (x,y) is in a subregion $R_1$ contained
in R is
$$\iint_{R_1} f(x,y)\,dx\,dy = \iint_{R_1} \frac{1}{A}\,dx\,dy = \frac{A_1}{A}$$
where $A_1$ is the area of $R_1$. The variable (x,y) is then said to be uniformly distributed in R.
The theory for finite sample spaces applies with little change to continuous
distributions; for example, the expectation is defined by
$$E(x) = \int_{-\infty}^{\infty} x f(x)\,dx = \frac{\int_{-\infty}^{\infty} x f(x)\,dx}{\int_{-\infty}^{\infty} f(x)\,dx}.$$
The latter expression follows from (6-10); it shows that E(x) is the x
coordinate of the center of mass for the area bounded by the curve y = f(x)
and the x axis. More generally, the expected value of any function y = g(x)
is
$$E(y) = \int_{-\infty}^{\infty} y f(x)\,dx, \qquad y = g(x), \tag{6-12}$$
and the sum theorem E(x + y) = E(x) + E(y) is a simple consequence
of the properties of integrals. Compare Sec. 5, Example 2.
Two variables x, y are said to be independent if the joint density f(x,y)
has the form
$$f(x,y) = f(x)\,g(y). \tag{6-13}$$
The theorem of compound probability for independent events is then
valid in the form
$$\Pr(a < x < b,\ c < y < d) = \Pr(a < x < b)\,\Pr(c < y < d). \tag{6-14}$$
The theorem of total probability assumes various forms, such as
$$\Pr(a < x < c) = \Pr(a < x < b) + \Pr(b < x < c) \tag{6-15}$$
for a < b < c. Equation (6-15) is equivalent to
$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$$
which, in turn, is a known property of integrals.¹
¹ It is also possible to start with the theorem of total probability and deduce from this
that probability can be represented as a Stieltjes integral (Sec. 5). Mild continuity
conditions then give the representation (6-3).
Example 1. A variable x is said to be uniformly distributed on (a,b) if f(x) is constant
on (a,b) and zero outside (a,b). Find f(x) in this case.
Denoting the constant by c, we have
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_a^b c\,dx = c(b - a) = 1$$
by (6-10). Solving for c yields
f(x) = 1/(b − a), a < x < b,
f(x) = 0, elsewhere.
Example 2. A stick of length a is broken at random into two pieces. Find the distribution
function F(s) for the length s of the shorter piece. From this find the
probability density f(l) for the length l of the longer piece.
Evidently 0 ≤ s ≤ a/2 in every case. For any t between 0 and a/2 we have
s ≤ t if, and only if, the break point x is on one of the intervals (0,t) or (a − t, a) (see Fig. 9).
The probability of that is 2t/a, since x is uniformly distributed, and hence
F(t) = 2t/a, 0 ≤ t ≤ a/2,
F(t) = 0, t < 0,
F(t) = 1, t > a/2.
Fig. 9
Since the length l of the longer piece satisfies l = a − s, we have l ≤ t if, and only if,
s ≥ a − t. By the result just obtained the probability is
$$1 - \Pr(s < a - t) = 1 - \frac{2(a - t)}{a}$$
for a/2 ≤ t ≤ a and 0 or 1 otherwise. This gives the distribution function for l. By
differentiation, the density is found to be
f(l) = 2/a, a/2 < l < a,
f(l) = 0, elsewhere.    (6-16)
The differentiation is not valid for l = a/2 or for l = a, but it does not matter how the
density is defined at these isolated points.
Example 3. A stick of length a is broken at random, and the longer piece is again
broken. What is the probability that the three segments can form a triangle?
Let l be the length of the longer piece. If this piece is broken at a point x, the three
segments are a − l, x, l − x. The condition for a triangle is that the sum of any two
segments shall exceed the third:
a − l + x > l − x, a − x > x, l > a − l.
Since l > a/2 automatically, these conditions reduce to
$$l - \frac{a}{2} < x < \frac{a}{2}. \tag{6-17}$$
It is a conceptual aid (and not incorrect) to use the theorems of total and compound
probability in the following manner: The probability that l is on the interval (l, l + dl) is
$$\frac{2}{a}\,dl$$
by (6-16). After l is chosen, the probability that x satisfies (6-17) is
$$\int_{l - a/2}^{a/2} \frac{dx}{l} = \frac{a - l}{l}$$
since x is uniformly distributed on (0,l). The probability of both these events is the
product
$$\frac{a - l}{l}\,\frac{2}{a}\,dl$$
by compound probability, and total probability now gives the final answer:
$$\int_{a/2}^{a} \frac{a - l}{l}\,\frac{2}{a}\,dl = 2\log 2 - 1 \approx 0.386.$$
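
The stick-breaking argument of Example 3 can be checked by simulation. The sketch below is only a Monte Carlo illustration under stated assumptions, not part of the text: the function name `triangle_probability`, the trial count, and the fixed seed are illustrative choices. The estimate is compared with the value 2 log 2 − 1 obtained by integrating ((a − l)/l)(2/a) over a/2 < l < a.

```python
import math
import random


def triangle_probability(trials=200_000, a=1.0, seed=0):
    """Estimate the chance that the three segments form a triangle."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        first = rng.uniform(0.0, a)    # first break, uniform on the stick
        longer = max(first, a - first)
        shorter = a - longer
        x = rng.uniform(0.0, longer)   # second break, uniform on the longer piece
        pieces = sorted([shorter, x, longer - x])
        # A triangle exists when the two smaller segments exceed the largest.
        if pieces[0] + pieces[1] > pieces[2]:
            hits += 1
    return hits / trials


exact = 2 * math.log(2) - 1
estimate = triangle_probability()
print(f"simulated {estimate:.4f}, integral {exact:.4f}")
```

With 200,000 trials the sampling error is of the order of 0.001, so agreement to about two decimal places is all that should be expected.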
Example 4. Buffon's Needle Problem. A needle of length a is dropped on a board
which is covered with parallel lines spaced a distance b > a (Fig. 10). What is the
probability that the needle intersects one of the lines?
We assume that the variables x and θ of the figure are uniformly distributed,
x being the distance from the center to the nearest line. There is intersection if,
and only if, |(a/2) cos θ| > x. For fixed θ, the probability of this is
$$\frac{|(a/2)\cos\theta|}{b/2} = \frac{a\,|\cos\theta|}{b}$$
since x is uniformly distributed on (0, b/2). Using total and compound probability as
in Example 3, we obtain the final answer:
$$\int_0^{2\pi} \frac{a\,|\cos\theta|}{b}\,\frac{d\theta}{2\pi} = \frac{a}{b}\,\frac{1}{2\pi}\; 4\int_0^{\pi/2} \cos\theta\,d\theta = \frac{2a}{\pi b}.$$
Fig. 10
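
Buffon's needle lends itself to the same kind of numerical check. The following Python sketch draws x uniformly on (0, b/2) and θ uniformly on (0, 2π), as assumed in Example 4, and counts intersections; the name `buffon_estimate`, the sample size, and the seed are hypothetical choices made for illustration. The result is compared with 2a/(πb).

```python
import math
import random


def buffon_estimate(a, b, trials=200_000, seed=0):
    """Fraction of simulated drops in which a needle of length a crosses a line."""
    if not 0 < a < b:
        raise ValueError("the derivation assumes 0 < a < b")
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0.0, b / 2)            # center to nearest line
        theta = rng.uniform(0.0, 2 * math.pi)  # orientation of the needle
        if abs((a / 2) * math.cos(theta)) > x:
            hits += 1
    return hits / trials


a, b = 1.0, 2.0
estimate = buffon_estimate(a, b)
exact = 2 * a / (math.pi * b)
print(f"simulated {estimate:.4f}, formula {exact:.4f}")
```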
PROBLEMS
1. A probability density is defined by f(x) = 3x² for 0 ≤ x ≤ 1 and f(x) = 0 elsewhere.
Find E(x) and E(x²). Find the distribution function F(x), and from this obtain
a value m such that x is just as likely as not to exceed m. (The value m is called the
median of x.)
2. The radius of a sphere is uniformly distributed on (0,1). Find the expected value
of the volume [see (6-12)]. What is the probability that the volume exceeds half its
maximum value?
3. A stick of length a is broken at random into two parts. What is the expected
length of the shorter part?
4. Two points are chosen at random on a line of length a. What is the probability
that the three segments can form a triangle?
5. The probability density for bullets hitting a target is given by
$$f(x,y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left\{-\frac{1}{2}\left[\left(\frac{x - m_x}{\sigma_x}\right)^2 + \left(\frac{y - m_y}{\sigma_y}\right)^2\right]\right\}$$
where $\sigma_x$, $\sigma_y$, $m_x$, $m_y$ are constant. Sketch the curves of constant density in the xy plane.
What kind of curves are they?
6. We make two independent observations $x_1$, $x_2$ of a variable with distribution function
F(x). What is the probability that a third independent observation $x_3$ will fall
between $x_1$ and $x_2$? Generalize to n observations. Hint: Use the methods of discrete
probability.
PROBABILITY AND RELATIVE FREQUENCY
7. Independent Trials. It often happens that the probability of an event
cannot be determined by counting cases or by other a priori considerations.
Sometimes the determination is impossible in principle; for instance,
one cannot compute the probabilities associated with a loaded die or the
probability that a given radio tube will fail in the first hundred hours'
use. Sometimes the determination is theoretically possible but impractical.
For instance, by examining every nail in a 100-lb keg one could find the
probability that a nail selected at random will be defective, but this is
not a useful method.
In many such cases an estimate for the probability can be obtained by
repeated trials (or by inspecting a suitable sample, in the terminology of
statistics). In the case of a biased coin, for example, if 10 tosses give 7
heads, 100 tosses give 73 heads, and 1,000 tosses give 690 heads, it appears
that "the probability of heads is probably close to 0.7." The two italicized
words express a reservation which is always present in conclusions such
as this.
The figures 7, 73, 690 in the above discussion represent the frequency
of heads; the ratios
7/10, 73/100, 690/1,000
give the relative frequency in 10, 100, or 1,000 trials. More generally,
if an event occurs m times in n trials, the relative frequency is m/n.
The trials in a sequence of trials are said to be independent if the probabilities
associated with a given trial do not depend on the results of preceding
trials. For example, the probability of heads on a given toss of a
symmetric coin is 1/2 no matter what is known about the results of previous
tosses. But if we try to get an ace by drawing cards one at a time without
replacing, the trials are dependent. In this case, the probability of ace
in a given trial depends on the number of aces that may have been drawn
previously.
When an event has constant probability p of success, the probability
of m successes in n independent trials may be computed as follows. A
sequence of m successes and n − m failures is represented by a sequence
of m letters S and n − m letters F:
SSFFSS . . . SF.    (7-1)
Since the trials are independent, the probability of any one such sequence
is
$$ppqqpp \cdots pq = p^m q^{n-m}, \tag{7-2}$$
where q = 1 − p. To obtain the number of favorable sequences, observe
that a sequence is determined as soon as the positions of the m letters S
are fixed. The m places for these letters S can be chosen from the n places
in $_nC_m$ ways, and hence the required probability is
$$p^m q^{n-m} + p^m q^{n-m} + \cdots + p^m q^{n-m} = {}_nC_m\, p^m q^{n-m} \tag{7-3}$$
by the theorem of total probability.
Alternatively, the reader may imagine a sample space in which each event consists of
a sequence (7-1), with associated measure (7-2). Then (7-3) represents the sum of the
measures of those points favorable to the event: m successes.
Replacing m by x gives
$$B(x) = {}_nC_x\, p^x q^{n-x} = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n-x} \tag{7-4}$$
for the probability of exactly x successes in n independent trials with
constant probability p. The associated distribution function is
$$F(t) = {}_nC_0\, q^n + {}_nC_1\, p\,q^{n-1} + \cdots + {}_nC_t\, p^t q^{n-t} \tag{7-5}$$
for integral values of t. This expression gives the probability of getting
at most t successes in n trials.
Because of its connection with the binomial theorem (Prob. 6), the
function B(x) is called the binomial frequency function, F(t) in (7-5) is the
binomial distribution, and the statement that B(x) gives the probability
of x successes in n independent trials is called the binomial law of probability.
Since many statistical studies involve repeated trials, the binomial
law has great practical importance.
To illustrate the use of the formula (7-4) let it be required to find the
probability that the ace will appear exactly 4 times in the course of 10
throws of a die. Here p = 1/6, q = 5/6, n = 10, x = 4. Hence the probability
is
$$B(4) = \frac{10!}{4!\,6!}\left(\frac{1}{6}\right)^4\left(\frac{5}{6}\right)^6 = 0.05427.$$
Since the expected number of successes in one trial is p, the expected
number in n trials is
$$E(x) = np \tag{7-6}$$
[compare (5-2)]. For most distributions there is no special relation between
the expected value and the most probable value, but for the binomial distribution
they happen to be almost equal. Equation (7-4) yields
$$\frac{B(x + 1)}{B(x)} = \frac{(n - x)p}{(x + 1)q}$$
after slight simplification. Hence B(x) is an increasing function of the
integer x if, and only if,
$$\frac{(n - x)p}{(x + 1)q} > 1.$$
The latter inequality is the same as
$$(n - x)p > (x + 1)q$$
which reduces to np > x + q, since p + q = 1. We have shown, then,
that B(x + 1) > B(x) as long as x < np − q but B(x + 1) < B(x) thereafter.
Since q ≤ 1, this establishes that B(x) is maximum for a value of
x which is within 1 of the value x = np. Further discussion of the function
B(x) is given in the following sections.
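
The binomial law (7-4) and the remark about the most probable value can be tried out numerically. The Python sketch below is an illustration only: the names `binomial` and `most_probable` are not from the text, and floating-point arithmetic is assumed to be accurate enough for the sizes used here.

```python
from math import comb


def binomial(x, n, p):
    """B(x) of (7-4): probability of exactly x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def most_probable(n, p):
    """Value of x in 0..n at which B(x) is largest (first one if tied)."""
    return max(range(n + 1), key=lambda x: binomial(x, n, p))


# Ace exactly 4 times in 10 throws of a true die: p = 1/6.
print(round(binomial(4, 10, 1 / 6), 5))

# The most probable value lies within 1 of np.
for n, p in [(10, 1 / 6), (5, 0.4), (30, 0.4), (17, 0.73)]:
    mode = most_probable(n, p)
    print(n, p, mode, abs(mode - n * p) < 1)
```

Searching all n + 1 values is the simplest way to locate the maximum; for large n one could instead use the inequality x < np − q derived above.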
Example 1. Ten tosses of a suspected die gave the result 1, 1, 1, 6, 1, 1, 3, 1, 1, 4. What
is the probability of at least this many aces if the die is true?
The event "at least 7 aces" can materialize in four mutually exclusive ways: 7 aces,
8 aces, 9 aces, 10 aces. By total probability (or by use of the distribution function) the
required answer is found to be
$$B(7) + B(8) + B(9) + B(10) = {}_{10}C_7\left(\tfrac{1}{6}\right)^7\left(\tfrac{5}{6}\right)^3 + {}_{10}C_8\left(\tfrac{1}{6}\right)^8\left(\tfrac{5}{6}\right)^2 + {}_{10}C_9\left(\tfrac{1}{6}\right)^9\left(\tfrac{5}{6}\right) + {}_{10}C_{10}\left(\tfrac{1}{6}\right)^{10}$$
when we take p = 1/6, n = 10. This reduces to 0.00027, approximately. Because the
observed result has such small probability, one would reject the hypothesis that p = 1/6
unless there is some other evidence in its favor.
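
The tail sum in Example 1 is short enough to evaluate exactly. The sketch below uses exact fractions; the helper name `tail_probability` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction
from math import comb


def tail_probability(n, p, k):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x)
               for x in range(k, n + 1))


# At least 7 aces in 10 throws of a true die.
exact = tail_probability(10, Fraction(1, 6), 7)
print(exact, float(exact))
```

The four terms have numerators 15000, 1125, 50, and 1 over the common denominator 6^10 = 60,466,176, which gives approximately 0.00027 as stated in the example.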
Example 2. In Example 1 let p be the unknown probability of the ace in a toss of
the die. (a) For what value of p does the expected number of aces agree with the observed
number? (b) For what value of p is the probability of the observed result a maximum?
Since E(x) = np by (7-6), the observed and expected numbers agree when p = x/n,
that is, when p = 0.7. The estimate for p given by p = x/n is called an unbiased estimate,
because E(x/n) = p.
For part (b), the probability of getting 7 aces and 3 other numbers is
$$p^7 q^3 \quad\text{or}\quad {}_{10}C_7\, p^7 q^3, \qquad q = 1 - p,$$
depending upon whether the order is considered or not. In either case the probability
is maximum when $p^7(1 - p)^3$ is maximum. This, in turn, is maximum when
$$\log p^7(1 - p)^3 = 7\log p + 3\log(1 - p)$$
is maximum. Differentiation gives
$$\frac{7}{p} - \frac{3}{1 - p} = 0,$$
or p = 0.7. An estimate for p such as this, which maximizes the probability of the observed
result, is called a maximum likelihood estimate.
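
The maximization in part (b) can also be carried out by brute force. The Python sketch below evaluates 7 log p + 3 log (1 − p) on a grid of p values and picks the largest; the grid spacing of 0.001 and the name `log_likelihood` are assumptions made for this illustration, and a grid search stands in for the differentiation used in the example.

```python
import math


def log_likelihood(p, successes=7, failures=3):
    """Logarithm of p^7 (1 - p)^3 for the observed 7 aces in 10 tosses."""
    return successes * math.log(p) + failures * math.log(1 - p)


# Candidate values 0.001, 0.002, ..., 0.999 (end points excluded: log 0 is undefined).
grid = [k / 1000 for k in range(1, 1000)]
p_hat = max(grid, key=log_likelihood)
print(p_hat)
```

The grid maximum agrees with the relative frequency 7/10, so here the maximum likelihood estimate and the unbiased estimate coincide.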
PROBLEMS
1. When 5 coins are tossed what is the probability of exactly 2 heads? At least 2
heads? What is the expected number of heads? The most probable number of heads?
2. If 6 dice are tossed simultaneously, what is the probability that (a) exactly 3 of
them turn the ace up? (b) At least 3 turn the ace up?
3. If the probability that a man aged sixty will live to be seventy is 0.65, what is the
probability that out of 10 men now sixty at least 7 will live to be seventy?
4. A man is promised $1 for each ace in excess of 1 that appears in 6 consecutive
throws of a die. What is the value of his expectation?
5. A bag contains 20 black balls and 15 white balls. What is the chance that at
least 4 in a sample of 5 balls are black?
6. (a) By use of the binomial theorem show that
$$(q + pt)^n = B(0) + B(1)t + B(2)t^2 + \cdots + B(n)t^n.$$
(b) Interpret the identity which arises when t = 1. (c) Differentiate with respect to t,
and interpret the identity which then arises for t = 1. [The function $(q + pt)^n$ is
called the generating function of the sequence {B(x)}.]
7. (a) One hundred light bulbs were tested for 500 hr, at the end of which time 57
bulbs had failed. Obtain an unbiased estimate and also a maximum-likelihood estimate
for the probability of failure in 500 hr. (b) Are these two estimates of p always
equal for the binomial distribution? Hint: In (b), compare the result of maximizing
$p^m q^{n-m}$ with respect to p and the result of choosing p so that E(x) = m, where m is the
number of observed successes.
8. In a certain agricultural experiment, the probability that a plant will have yellow
flowers is If 10,000 plants are grown, what is the probability that the number with
yellow flowers will be between 7,400 and 7,000? (To appreciate later developments ob-
serve that your answer, which should be indicated only, is difficult to compute.)
8. An Illustration. Some interesting conclusions concerning the binomial
law are suggested by an example that presents many features of the
general case. Consider a purse in which are placed 2 silver and 3 gold
coins, and let it be required to find the probability of drawing exactly x
silver coins in n trials, the coin being replaced after each drawing. The
probability of exactly x successes in n trials is given by (7-4) where p,
the probability of drawing a silver coin in a single trial, is 2/5. If the
number of drawings is taken as n = 5, 10, or 30, the respective frequency
functions B(x) are
$$B(x) = {}_5C_x\left(\tfrac{2}{5}\right)^x\left(\tfrac{3}{5}\right)^{5-x}, \qquad n = 5,$$
$$B(x) = {}_{10}C_x\left(\tfrac{2}{5}\right)^x\left(\tfrac{3}{5}\right)^{10-x}, \qquad n = 10,$$
$$B(x) = {}_{30}C_x\left(\tfrac{2}{5}\right)^x\left(\tfrac{3}{5}\right)^{30-x}, \qquad n = 30.$$
By use of these expressions one can compute the values of B(x) to any desired accuracy.
The result of such a computation to four places of decimals is presented in the
accompanying tables. In the third table the entry 0.0000 is made for 0 ≤ x ≤ 2 and
for x ≥ 23 because in these cases B(x) was found to be less than 0.00005. For example,
the probability of drawing exactly 23 silver coins in 30 trials is
$$B(23) = {}_{30}C_{23}\left(\tfrac{2}{5}\right)^{23}\left(\tfrac{3}{5}\right)^{7} = 0.000040128.$$
The reader can verify that the most probable values of x are exactly equal to np
(and not merely within 1 of np). This behavior is always found when np is an integer.
Probability of Exactly x Successes in 5 Trials

x     B(x)       x     B(x)
0     0.0778     3     0.2304
1     0.2592     4     0.0768
2     0.3456     5     0.0102

Probability of Exactly x Successes in 10 Trials

x     B(x)       x     B(x)       x     B(x)
0     0.0060     4     0.2508     8     0.0106
1     0.0403     5     0.2007     9     0.0016
2     0.1209     6     0.1115     10    0.0001
3     0.2150     7     0.0425

Probability of Exactly x Successes in 30 Trials

x     B(x)       x     B(x)       x     B(x)
≤2    0.0000     9     0.0823     16    0.0489
3     0.0003     10    0.1152     17    0.0269
4     0.0012     11    0.1396     18    0.0129
5     0.0041     12    0.1474     19    0.0054
6     0.0115     13    0.1360     20    0.0020
7     0.0263     14    0.1100     21    0.0006
8     0.0505     15    0.0783     22    0.0002
                                  ≥23   0.0000
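
Tables of this kind are easy to regenerate. The following Python sketch computes B(x) for p = 2/5 and rounds to four places; the function name `binomial_table` is an illustrative choice, and ordinary floating-point arithmetic is assumed to be adequate at this precision.

```python
from math import comb


def binomial_table(n, p, places=4):
    """Values B(0), ..., B(n) of (7-4), rounded to the given number of places."""
    return [round(comb(n, x) * p ** x * (1 - p) ** (n - x), places)
            for x in range(n + 1)]


for n in (5, 10, 30):
    table = binomial_table(n, 0.4)
    peak = table.index(max(table))   # most probable number of successes
    print(f"n = {n}: most probable x = {peak}, B = {table[peak]}")
    print(table)
```

In each of the three cases the most probable value equals np (2, 4, and 12), in line with the remark that this happens whenever np is an integer.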
The values given in the tables are presented graphically in Fig. 11 after
the manner described in Sec. 5. Each curve has the general shape predicted
by the theory of the preceding section, but the figure shows also
how the shape changes as we proceed from one curve to another. The
numerical area under each curve is 1, although the curves become broader
and flatter as n increases. In particular the maximum (that is, the probability
of the most probable value) decreases as n increases. This is just
what one would expect intuitively. (For instance, one could easily get
2 heads in 4 tosses of a coin, but one would be surprised to get exactly
500,001 heads in 1,000,002 tosses.) The fact that the curves become
broader indicates that the values of x experience a wider spread when
there are more trials, and this, too, one would expect. Naturally, the
curves ought to get broader if the maximum is to decrease while the area
remains equal to 1.
The foregoing discussion is concerned with the frequency of success in
n trials. The results are very different if, instead, one considers the relative
frequency x/n. The distribution for the variable x/n is presented graphically
in Fig. 12. These curves were obtained from the preceding by the
change of scale indicated on the axes, and hence, the area is still 1. Instead
of becoming broader, these curves become narrower as n increases. The
relative frequency x/n tends to cluster about its expected value p as n
gets large. It is for this reason that relative frequency can be used to
estimate an unknown probability.
Fig. 11
Fig. 12
The behavior suggested by this example may be summarized as follows.
When the number of trials n becomes large, the absolute deviation from
the expected value
$$|x - np| = |x - E(x)|$$
is likely also to be large, but the relative deviation
$$\left|\frac{x - np}{n}\right| = \left|\frac{x}{n} - p\right| = \left|\frac{x}{n} - E\!\left(\frac{x}{n}\right)\right|$$
is likely to be small.¹
¹ It will be seen in Sec. 9 that the first expression is usually of the order $\sqrt{n}$ and the
second, of order $1/\sqrt{n}$; compare Prob. 3.
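
The contrast between absolute and relative deviation can be made concrete with a short computation. The sketch below uses the root-mean-square deviation as the measure of spread; that choice, and the name `spread`, are assumptions made for this illustration and are not introduced by the text at this point.

```python
from math import comb, sqrt


def spread(n, p):
    """Root-mean-square deviation of x from np, and of x/n from p."""
    probs = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
    mean_square = sum((x - n * p) ** 2 * b for x, b in enumerate(probs))
    rms_count = sqrt(mean_square)
    return rms_count, rms_count / n


for n in (5, 10, 30, 300):
    rms_x, rms_rel = spread(n, 0.4)
    print(f"n = {n:3d}: spread of x = {rms_x:.4f}, spread of x/n = {rms_rel:.4f}")
```

For the purse example with p = 2/5 the first column grows with n while the second shrinks, which is the behavior described above.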
PROBLEMS
1. Plot a distribution curve like that of Fig 1 1 for the probability of x successes in 4
trials when p » %, Shade the area corresponding to the event 1 < x < 3, and find
the probability of this event.
2 . For plot the probability of the most probable number of successes versus to.
(Take points at w « 1, 2, 3, 4, 5410, 30 only, cf. Prob, 1 and accompanying tables.)
On the same figure plo t 1 /-y /2 irnpq versus to. (It is shown in Sec. 9 that the probability
is asymptotic to \/y/2vnpq when n is large. This expression appioachos zero as to *-> *>,
even though we are considering the most probable value )
3. Using the tables and your numerical values in Prob. 2, plot y/n B(x) versus
(x — np)/y/n for p * % and for « = 3, 4, 5, 10. Use the same scale in each case.
Formulate a conjecture concerning the behavior as to ~+ and test your conjecture
by plotting five well-chosen points on the curve corresponding to n ** 30.
9. The Laplace-de Moivre Limit Theorem. Numerical computation of
the binomial distribution is difficult when n is large. In this section an
approximate formula is obtained when n and np are both large. In Sec. 11
a formula is found when n is large but np is not large. These approximations,
together with the exact formula when n is moderate, cover all cases.
The analysis is based on the Stirling formula,
$$n! \sim n^n e^{-n}\sqrt{2\pi n}, \tag{9-1}$$
which is made plausible by the following discussion. Consider the function y = log x,
and observe that for k ≥ 2,
$$\int_{k-1}^{k} \log x\,dx > \tfrac{1}{2}[\log(k - 1) + \log k],$$
since the right-hand member represents the trapezoidal area formed by the chord (Fig. 13)
joining the points P and Q on the curve y = log x.
Denote the area between the chord and the curve by $a_k$, so that
$$\int_{k-1}^{k} \log x\,dx = \tfrac{1}{2}[\log(k - 1) + \log k] + a_k. \tag{9-2}$$
Setting k = 2, 3, ..., n in (9-2) and adding give
$$\int_1^n \log x\,dx = \tfrac{1}{2}(\log 1 + \log 2) + \tfrac{1}{2}(\log 2 + \log 3) + \cdots + \tfrac{1}{2}[\log(n - 1) + \log n] + (a_2 + a_3 + \cdots + a_n).$$
Integrating the left-hand member and combining the terms of the right-hand member give
$$n\log n - n + 1 = \log n! - \tfrac{1}{2}\log n + \sum_{k=2}^{n} a_k.$$
Hence,
$$\log n! = (n + \tfrac{1}{2})\log n - n + 1 - \sum_{k=2}^{n} a_k. \tag{9-3}$$
Since each $a_i$ is positive, it follows that
$$\log n! < (n + \tfrac12)\log n - n + 1$$
and hence
$$n! < e\sqrt{n}\,n^n e^{-n}. \tag{9-4}$$
The expression on the right of the inequality (9-4) is, therefore, an upper bound for n!.
To get a lower bound, solve (9-2) for $a_k$, perform the integration, and obtain
$$a_k = (k - \tfrac12)\log\frac{k}{k-1} - 1. \tag{9-5}$$
Now, since the integrand is nonnegative,
$$\int_{k-1}^{k}\left(\frac1x - \frac1k\right)^2 dx > 0 \tag{9-6}$$
and the evaluation of (9-6) leads to the formula
$$\log\frac{k}{k-1} < \frac{2k-1}{2k(k-1)}.$$
By use of this inequality, (9-5) gives
$$a_k < \frac{1}{4k(k-1)} = \frac14\left(\frac{1}{k-1} - \frac1k\right),$$
so that
$$\sum_{k=2}^{n} a_k < \frac14\left[\left(1 - \frac12\right) + \left(\frac12 - \frac13\right) + \cdots + \left(\frac{1}{n-1} - \frac1n\right)\right] < \frac14.$$
By means of this result and (9-3), one obtains
$$\log n! > (n + \tfrac12)\log n - n + 1 - \tfrac14$$
whence
$$n! > e^{3/4}\sqrt{n}\,n^n e^{-n}. \tag{9-7}$$
Combining (9-4) and (9-7) furnishes the inequality¹
$$e^{3/4}\sqrt{n}\,n^n e^{-n} < n! < e\sqrt{n}\,n^n e^{-n}$$
for all values of n > 1. Since e = 2.718, $e^{3/4}$ = 2.117, and $\sqrt{2\pi}$ = 2.507, we have shown
that (9-1) is correct as to order of magnitude. More refined methods establish that the
error is less than 10 per cent for n ≥ 1, less than 1 per cent for n ≥ 10, and less than
0.1 per cent for n ≥ 100. Moreover, the percentage error approaches zero as $n \to \infty$,
so that the equality is asymptotic.
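The bounds (9-4) and (9-7) and the quoted percentage errors can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names `stirling` and `bounds` are our own.

```python
import math

def stirling(n):
    """Stirling approximation n^n e^(-n) sqrt(2 pi n) of (9-1)."""
    return n**n * math.exp(-n) * math.sqrt(2 * math.pi * n)

def bounds(n):
    """Lower bound (9-7) and upper bound (9-4) for n!."""
    core = math.sqrt(n) * n**n * math.exp(-n)
    return math.exp(0.75) * core, math.e * core

def relative_error(n):
    """Fraction by which (9-1) falls short of the exact n!."""
    exact = math.factorial(n)
    return (exact - stirling(n)) / exact

for n in (2, 5, 10, 20, 100):
    lo, hi = bounds(n)
    inside = lo < math.factorial(n) < hi
    print(n, inside, f"{100 * relative_error(n):.3f} per cent")
```

The loop prints, for each n, whether n! lies between the two bounds and the percentage by which (9-1) underestimates it.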
In the expression
$$B(r) = \frac{n!}{r!\,(n-r)!}\,p^r q^{n-r} \tag{9-8}$$
for the probability of r successes in n independent trials, we assume that
r, n, and n − r are large enough to permit the use of Stirling's formula
1 The derivation of this result is given by P. M. Hummel, Am. Math. Monthly , 47:97
(1940).
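(9-1). Replacing n!, r!, and (n − r)! by their approximations gives, after
simplification,
$$B(r) \sim \left(\frac{np}{r}\right)^{r}\left(\frac{nq}{n-r}\right)^{n-r}\sqrt{\frac{n}{2\pi r(n-r)}}. \tag{9-9}$$
Let $\delta$ denote the deviation of r from the expected value np; that is,
$$\delta = r - np.$$
Then,
$$n - r = nq - \delta$$
and (9-9) becomes¹
$$B(r) \sim \left(\frac{np}{np+\delta}\right)^{np+\delta}\left(\frac{nq}{nq-\delta}\right)^{nq-\delta}\sqrt{\frac{n}{2\pi(np+\delta)(nq-\delta)}}$$
or
$$B(r)\,A \sim \left(1 + \frac{\delta}{np}\right)^{-(np+\delta)}\left(1 - \frac{\delta}{nq}\right)^{-(nq-\delta)},$$
where
$$A = \sqrt{2\pi npq\left(1 + \frac{\delta}{np}\right)\left(1 - \frac{\delta}{nq}\right)}.$$
Then,
$$\log B(r)A \sim -(np+\delta)\log\left(1 + \frac{\delta}{np}\right) - (nq-\delta)\log\left(1 - \frac{\delta}{nq}\right).$$
Assuming $|\delta| < npq$, so that
$$\left|\frac{\delta}{np}\right| < 1 \quad\text{and}\quad \left|\frac{\delta}{nq}\right| < 1,$$
permits one to write the two convergent series
$$\log\left(1 + \frac{\delta}{np}\right) = \frac{\delta}{np} - \frac{\delta^2}{2n^2p^2} + \frac{\delta^3}{3n^3p^3} - \cdots$$
and
$$\log\left(1 - \frac{\delta}{nq}\right) = -\frac{\delta}{nq} - \frac{\delta^2}{2n^2q^2} - \frac{\delta^3}{3n^3q^3} - \cdots.$$
Hence,
$$\log B(r)A \sim -\frac{\delta^2}{2npq} - \frac{\delta^3(p^2 - q^2)}{2\cdot 3\,n^2p^2q^2} - \frac{\delta^4(p^3 + q^3)}{3\cdot 4\,n^3p^3q^3} - \cdots.$$
Now, if $|\delta|$ is so small in comparison with npq that one can neglect all
terms in this expansion beyond the first and can replace A by $\sqrt{2\pi npq}$,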
1 Here and in similar cases which arise subsequently, we assume that p ≠ 0 and
q ≠ 0. The cases p = 0 or p = 1 can be dealt with by inspection.
then there results the approximate formula
$$B(r) \sim \frac{1}{\sqrt{2\pi npq}}\,e^{-\delta^2/2npq} \tag{9-10}$$
which bears the name of Laplace's, or the normal, approximation. With
$\sigma = \sqrt{npq}$, Eq. (9-10) becomes
$$B(r) \sim \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\delta^2/2\sigma^2}. \tag{9-11}$$
The equality is asymptotic; that is, the ratio of the two sides tends to
1 as $n \to \infty$. A comparison of B(r) with the normal approximation is
given in Fig. 14.
The main usefulness of this result is to compute the probability
$$\sum_{r=r_1}^{r_2} B(r) \tag{9-12}$$
that the number of successes is between the given limits $r_1$ and $r_2$. Equation
(9-11) shows that the sum (9-12) may be approximated by a sum
$$\sum \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\delta^2/2\sigma^2} \tag{9-13}$$
over appropriate values of $\delta$. Since $\delta = r - np$, the difference between
successive values of $\delta$ is 1, and hence if we let $t = \delta/\sigma$, the difference
between successive values of t is $\Delta t = 1/\sigma$. Thus (9-13) becomes a sum over t,
$$\sum \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,\Delta t. \tag{9-14}$$
As $\Delta t \to 0$, the expression (9-14) approaches an integral, which may be
evaluated in terms of the function
$$\Phi(t) = \frac{1}{\sqrt{2\pi}}\int_{0}^{t} e^{-s^2/2}\,ds \tag{9-15}$$
tabulated in Appendix D. These considerations yield the following fundamental
result, known as the Laplace-de Moivre limit theorem:
Theorem. Let x be the number of successes in n independent trials with
constant probability p. Then the probability of the inequality
$$t_1 \le \frac{x - np}{\sqrt{npq}} \le t_2 \tag{9-16}$$
approaches the limit
$$\frac{1}{\sqrt{2\pi}}\int_{t_1}^{t_2} e^{-t^2/2}\,dt = \Phi(t_2) - \Phi(t_1) \tag{9-17}$$
as $n \to \infty$.
To complete the proof one must note that the error in passing from (9-12) to (9-13)
is small for large n even when the number of terms in the sum is large. A more detailed
analysis, taking due account of this question, is given in William Feller, "Probability
Theory and Its Applications," pp. 133-137, John Wiley & Sons, Inc., New York,
1950. It is shown that a better approximation is given by
Vnpg,
(9-18)
although the improvement is not important when n is large. An expression for the
error in the approximation is derived in J. V. Uspensky, "Introduction to Mathematical
Probability," p. 129, McGraw-Hill Book Company, Inc., New York, 1937.
To illustrate the use of the result (9-17), let us find the probability that
the number of aces will be between 80 and 110 when a true die is tossed
600 times. Here n = 600, p = 1/6, q = 5/6, and x varies from 80 to 110.
Hence
$$t_1 = \frac{80 - 100}{\sqrt{(100)(5/6)}} = -2.19 \quad\text{and}\quad t_2 = \frac{110 - 100}{\sqrt{(100)(5/6)}} = 1.09.$$
The table gives $\Phi(t_2) = \Phi(1.09) = 0.362$, and similarly
$$\Phi(-2.19) = -\Phi(2.19) = -0.486.$$
[Observe that $\Phi(-t) = -\Phi(t)$, since the curve $y = e^{-t^2/2}$ is symmetric.]
Hence the required probability is, approximately,
$$0.362 - (-0.486) = 0.848.$$
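As a check on this illustration, the normal estimate can be set beside the exact binomial sum (9-12). The sketch below is ours, not the authors'; it uses `statistics.NormalDist` from the Python standard library in place of the table of Appendix D, and the helper names `phi` and `binom_pmf` are assumptions.

```python
import math
from statistics import NormalDist

def binom_pmf(n, r, p):
    """Exact binomial probability B(r) of (9-8)."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def phi(t):
    """Phi(t) of (9-15): area under the standard normal curve from 0 to t."""
    return NormalDist().cdf(t) - 0.5

n, p = 600, 1 / 6
sigma = math.sqrt(n * p * (1 - p))
t1 = (80 - n * p) / sigma
t2 = (110 - n * p) / sigma

normal_estimate = phi(t2) - phi(t1)                              # formula (9-17)
exact = sum(binom_pmf(n, r, p) for r in range(80, 111))          # sum (9-12)

print(f"t1 = {t1:.2f}, t2 = {t2:.2f}")
print(f"normal approximation: {normal_estimate:.3f}")
print(f"exact binomial sum:   {exact:.3f}")
```

The two printed probabilities differ slightly because (9-17) treats the discrete sum as an integral with limits exactly at $t_1$ and $t_2$.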
Example 1. In the notation of the text, the probability $P_{\max}$ of the most probable
value of r satisfies
$$P_{\max} \sim \frac{1}{\sqrt{2\pi npq}} \tag{9-19}$$
when n is large.
In Sec. 7 it was found that the most probable value is of the form
$$r = np + \theta, \qquad |\theta| < 1.$$
For this value of r we have
$$\delta = r - np = \theta$$
and hence Eq. (9-10) shows that the associated probability is asymptotic to
$$\frac{1}{\sqrt{2\pi npq}}\,e^{-\theta^2/2npq}.$$
As $n \to \infty$, the exponential tends to 1, since $\theta$ is bounded, and this yields (9-19).
Example 2. In an agricultural experiment Mendelian theory yields a probability
p = 1/4 that any given plant should have blue flowers. Out of 10,000 plants it was
found that 2,578 had blue flowers. Does this result contradict the theory?
According to theory, there should have been 2,500 plants with blue flowers; that is,
the expected number is np = 2,500. There were, in fact, 78 more than this. We have
to decide if this excess is too large to be attributed to chance.
Let us find the probability that the excess will be 78 or more if the hypothesis p = 1/4
is indeed correct. The inequality
$$78 \le x - np \tag{9-20}$$
becomes
$$1.801 \le \frac{x - np}{\sqrt{npq}} < \infty \tag{9-21}$$
when divided by $\sqrt{npq} = 43.3$. According to the table the probability of (9-21) is
$$\Phi(\infty) - \Phi(1.801) = 0.500 - 0.464 = 0.036.$$
Now, in a statistical test it is customary to reject the hypothesis if the hypothesis makes
the probability of the observed result less than a fixed quantity $\alpha$ determined beforehand.
The value $\alpha$ (which is called the significance level of the test) is often taken to be
0.05. Since our probability 0.036 is less than 0.05, the experimental outcome is considered
too unlikely to be attributed to chance, and we reject the hypothesis "p = 1/4."
In this sense, the experiment contradicts Mendelian theory.
We now give another analysis which leads to the opposite conclusion. Instead of
saying "the excess was 78," one could just as well say, "the discrepancy was 78," meaning
$$|x - np| \ge 78. \tag{9-22}$$
Both statements are equally valid descriptions of the experimental outcome. The
probability of
$$|x - np| < 78$$
is found, as above, to be
$$\Phi(1.801) - \Phi(-1.801) = 0.928,$$
and hence the probability of the contrary event is
$$1 - 0.928 = 0.072.$$
Since 0.072 > 0.05, a discrepancy of "78 or more" is sufficiently probable to be attributed
to chance (if, as before, our significance level is 0.05). Hence the hypothesis
is not contradicted by the experiment.¹
1 When the probability exceeds the significance level, as in this case, the hypothesis is
not thereby proved but it is considered to have withstood the experimental test.
It requires statistical methods of considerable subtlety to decide between competing
tests of a hypothesis such as the foregoing. These methods show that the first procedure
is appropriate for testing the hypothesis "p = 1/4" against the alternative "p > 1/4,"
whereas the second is appropriate for testing the hypothesis against the alternative
"p ≠ 1/4." A very readable account of the subject is given in P. G. Hoel, "Introduction
to Mathematical Statistics," chap. 10, John Wiley & Sons, Inc., New York, 1954.
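The two competing computations of Example 2 differ only in whether one tail or both tails of the normal curve are counted. A short Python sketch of both follows; it is an illustration of ours, with the standard-library `NormalDist` standing in for the table, and the variable names are assumptions.

```python
import math
from statistics import NormalDist

n, p, observed = 10_000, 0.25, 2_578
alpha = 0.05                                  # significance level used in the text

sigma = math.sqrt(n * p * (1 - p))            # sqrt(npq), about 43.3
t = (observed - n * p) / sigma                # about 1.801

one_sided = 1 - NormalDist().cdf(t)           # probability of (9-21): excess of 78 or more
two_sided = 2 * one_sided                     # probability of (9-22): discrepancy of 78 or more

print(f"t = {t:.3f}")
print(f"one-sided: {one_sided:.3f}  reject: {one_sided < alpha}")
print(f"two-sided: {two_sided:.3f}  reject: {two_sided < alpha}")
```

With these data the one-sided probability falls below the significance level and the two-sided probability does not, which is the disagreement discussed above.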
PROBLEMS
1. Two dice are tossed 1,000 times. What is, approximately, the probability of getting
a sum of 4 the most probable number of times? 500 times? (Use a table of
exponentials.)
2. A true coin is to be tossed 1,600 times, and it is desired to find the probability that
the number of heads x will satisfy 780 < x < 830. (a) Show that this inequality is
equivalent to
$$-1 < \frac{x - np}{\sqrt{npq}} < 1.5.$$
(b) Express the probability of the latter inequality in terms of $\Phi$ by means of the normal
law. (c) Using the table, evaluate the probability.
3. By means of the normal law, obtain an approximate numerical answer to Prob. 8,
Sec. 7.
4. A machine has a probability p = 0.01 of producing a defective bottle. In one
day's run, out of 10,000 bottles, 120 were defective. Find the approximate probability
of at least this many defectives if the machine is running as usual.
5. A suspected die gave only 960 aces in 6,000 tosses. If the die is true, (a) what is
the probability of getting at most 960 aces in 6,000 tosses? (b) What is the probability
of getting a discrepancy $|x - np|$ of "40 or more"? (c) At a significance level of 0.05,
does either calculation indicate that the die is loaded?
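10. The Law of Large Numbers. Since $\sum B(r) = 1$ for each value of n,
it is natural to expect, by the foregoing analysis, that
$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-t^2/2}\,dt = 1. \tag{10-1}$$
For a direct proof of (10-1), define I by
$$I = \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \int_{-\infty}^{\infty} e^{-y^2/2}\,dy. \tag{10-2}$$
Then multiplication of the two expressions (10-2) yields
$$I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy \tag{10-3}$$
after changing to a double integral. In polar coordinates,
$$I^2 = \int_{0}^{2\pi}\int_{0}^{\infty} e^{-r^2/2}\,r\,dr\,d\theta = 2\pi\int_{0}^{\infty} e^{-r^2/2}\,d(r^2/2) = 2\pi \tag{10-4}$$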
so that $I = \sqrt{2\pi}$, and (10-1) follows. The transformations leading to
(10-3) and (10-4) are justified by the fact that (10-3) is an absolutely convergent
double integral.
Equation (10-1) shows that the function
$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt = \Phi(x) + \frac12$$
is a distribution function; it is called the normal distribution. The theorem
of the preceding section asserts that the variable $\delta/\sigma$ is approximately
normally distributed when n is large. This fact will now be used to establish
the following fundamental result, which is a special case of the
so-called law of large numbers:
Theorem. Let x be the number of successes in n independent trials with
constant probability p. If $\epsilon$ is any positive number, then the probability of the
inequality
$$\left|\frac{x}{n} - p\right| < \epsilon \tag{10-5}$$
tends to 1 as $n \to \infty$.
In other words, the relative frequency of the event is almost sure to be
close to the probability of the event when the number of trials is large.
For proof, write the inequality (10-5) in the form
$$-\epsilon < \frac{x - np}{n} < \epsilon$$
which becomes
$$-\epsilon\sqrt{\frac{n}{pq}} < \frac{x - np}{\sqrt{npq}} < \epsilon\sqrt{\frac{n}{pq}} \tag{10-6}$$
when multiplied by $\sqrt{n/pq}$. Given any number $t_0$ (no matter how large),
we can choose n so that $\epsilon\sqrt{n/pq} > t_0$. In this case the probability of the
inequality (10-6) is at least equal to the probability of
$$-t_0 \le \frac{x - np}{\sqrt{npq}} \le t_0. \tag{10-7}$$
As $n \to \infty$, the latter probability¹ tends to
$$\frac{1}{\sqrt{2\pi}}\int_{-t_0}^{t_0} e^{-t^2/2}\,dt \tag{10-8}$$
by (9-17). Since $t_0$ is as large as we please, Eq. (10-1) shows that the integral
(10-8) is as close to 1 as we please, and this completes the proof.
1 One must not apply (9-17) directly to (10-6), because (9-17) was obtained only for
fixed $t_1$ and $t_2$, whereas the limits in (10-6) depend on n.
The theorem was established first by James Bernoulli (1654-1705)
after 20 years of effort. The law of large numbers lies at the basis of all
attempts to estimate a probability experimentally, and it affords a philosophical
justification for such attempts. In fact, some developments of
the subject define probability in terms of relative frequency, by the formula
$p = \lim (x/n)$ as $n \to \infty$, and rely on the law of large numbers to ensure
that the limit exists.
The theorem makes possible some interesting computational procedures,
known as Monte Carlo methods. Although the method is not to be discussed
at length here, we sketch an example that illustrates some of the main
features. Suppose a man walks in a straight line, taking a step of length
h ft every s sec (see Fig. 15). Each step is equally likely to be to the right
or to the left, without regard to the preceding steps. Assuming that x
is a multiple of h and t is a multiple of s, it is required to find the probability
that the man is x ft from his starting point at time t.
[Fig. 15]
Let U(x,t) stand for the probability in question; that is, U(x,t) is the
probability of the man's being at point x at time t if he was at point x = 0
at time t = 0. Now, he can arrive at point x at time t + s in two ways.
Either he was at point x + h at time t and took a step to the left, or he
was at point x − h at time t and took a step to the right. The probability
of being at x + h at time t is U(x + h, t) by the definition of U, and the
probability of a step to the left is 1/2 by hypothesis. Hence the probability
of both events is
$$\tfrac12\,U(x + h, t)$$
by compound probability. In just the same way the probability of being
at x − h and then stepping to the right is
$$\tfrac12\,U(x - h, t).$$
By total probability, the probability of getting to the point x at time
t + s is the sum, and we are thus led to a difference equation for U,
$$U(x, t + s) = \tfrac12\,U(x + h, t) + \tfrac12\,U(x - h, t). \tag{10-9}$$
The boundary conditions are
$$U(x,0) = 0 \ \text{ for } x \ne 0, \qquad \sum_x U(x,t) = 1 \tag{10-9a}$$
which express the fact that he is sure to be at the origin when t = 0 and
sure to be at some point x for all t.
To apply the Monte Carlo method to this problem, we make a large
number of actual random walks experimentally. The number of times
we arrive at point x at time t gives an estimate for the probability U(x,t)
by virtue of the law of large numbers. Hence, the calculation yields an
approximate solution of the problem (10-9) without any direct use of
(10-9). In practice, the “random walks” are made on a computing machine
by reference to a set of random numbers. Similar methods apply to
difference equations of much greater complexity than (10-9).
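The procedure just described can be sketched in Python. This is an illustrative sketch only: the units h = 1 and s = 1, the number of steps, the number of trials, the seed, and the function names are all assumptions of ours. The function `exact_walk` iterates (10-9) directly, while `monte_carlo_walk` estimates the same probabilities from simulated walks.

```python
import random
from collections import Counter

def exact_walk(steps):
    """Iterate the difference equation (10-9) in units h = 1, s = 1."""
    u = {0: 1.0}                              # U(x, 0): sure to be at the origin
    for _ in range(steps):
        nxt = {}
        for x, prob in u.items():
            # Half of the probability at x moves to each neighbor.
            nxt[x + 1] = nxt.get(x + 1, 0.0) + 0.5 * prob
            nxt[x - 1] = nxt.get(x - 1, 0.0) + 0.5 * prob
        u = nxt
    return u

def monte_carlo_walk(steps, trials, seed=0):
    """Estimate U(x, t) as the relative frequency of arriving at x."""
    rng = random.Random(seed)
    counts = Counter(
        sum(rng.choice((-1, 1)) for _ in range(steps)) for _ in range(trials)
    )
    return {x: c / trials for x, c in counts.items()}

steps, trials = 10, 20_000
exact = exact_walk(steps)
estimate = monte_carlo_walk(steps, trials)
for x in sorted(exact):
    print(f"x = {x:+3d}  exact {exact[x]:.4f}  estimate {estimate.get(x, 0.0):.4f}")
```

By the law of large numbers the estimated column approaches the exact column as the number of trials is increased.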
For readers familiar with the theory of heat conduction the foregoing example yields
an interesting interpretation¹ of the normal law. Subtracting U(x,t) from both sides
of (10-9) and dividing by s give
$$\frac{U(x, t+s) - U(x,t)}{s} = \frac{h^2}{2s}\left[\frac{U(x+h, t) - 2U(x,t) + U(x-h, t)}{h^2}\right].$$
If we set $s = h^2$ and let $h \to 0$, this becomes, formally,²
$$\frac{\partial U}{\partial t} = \frac12\,\frac{\partial^2 U}{\partial x^2} \tag{10-10}$$
with boundary condition
$$U(x,0) = 0 \ \text{ for } x \ne 0, \qquad \int_{-\infty}^{\infty} U(x,t)\,dx = 1. \tag{10-10a}$$
Since these are the conditions for an instantaneous source of heat at the origin, a solution
is³
$$U(x,t) = \frac{1}{\sqrt{2\pi t}}\,e^{-x^2/2t}. \tag{10-11}$$
Now, in the random walk the probability of a steps to the right and b to the left is
given approximately by the normal approximation (9-10); it turns out to be
$$\sqrt{\frac{2}{\pi(a+b)}}\;e^{-(a-b)^2/2(a+b)}. \tag{10-12}$$
If the man arrives at point x at time t, he makes t/s steps altogether and x/h more steps
to the right than to the left:
$$a + b = \frac{t}{s}, \qquad a - b = \frac{x}{h}.$$
Substitution in (10-12) and setting $s = h^2$ yield
$$\frac{1}{\sqrt{2\pi t}}\,e^{-x^2/2t}\,(2h) \tag{10-13}$$
for the probability. Here 2h is the distance between possible values of x when t is fixed,
and hence the coefficient of 2h may be regarded as a probability density. The condition
1 Since heat is due to random motion of the molecules, the analogy of the random-walk
problem with the problem of heat flow has a physical basis as well as the mathematical
basis outlined in the text.
2 See Chap. 6, Sec. 26.
3 See Chap. 6, Sec. 19.
"h small" means simply that the number of steps is large, so that the normal law is
applicable. The analogy between (10-11) and (10-13) is evident.
The discussion shows that not only (10-9) but the problem in heat flow given by
(10-10) may be attacked by making random walks. Some of the main applications of
Monte Carlo methods are, in fact, to the study of partial differential equations.
Example: A true coin is tossed repeatedly. It is desired to have a probability of 0.99
that the relative frequency of heads shall be within 1 per cent of the probability of
heads. How many times must the coin be tossed?
If the coin is tossed n times, the desired inequality is
$$\left|\frac{x}{n} - p\right| < 0.01p \tag{10-14}$$
which is the same as
$$-0.01\sqrt{\frac{np}{q}} < \frac{x - np}{\sqrt{npq}} < 0.01\sqrt{\frac{np}{q}}. \tag{10-15}$$
Setting the probability of (10-15) equal to 0.99 and noting that p = q, we get
$$0.99 = \Phi(0.01\sqrt{n}) - \Phi(-0.01\sqrt{n}) = 2\Phi(0.01\sqrt{n})$$
by the normal approximation. The table gives
$$0.01\sqrt{n} = 2.58,$$
so that n = 67,000 approximately. The fact that a problem such as this will always
yield a finite value for n is the essential content of the law of large numbers. Applying
the law of large numbers in another fashion, we can interpret the result more or less as
follows: If the whole coin-tossing experiment is repeated a great many times, in about
99 per cent of these experiments the inequality (10-14) will be verified.
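The arithmetic of the Example can be sketched in Python under the same normal approximation. The function name `tosses_needed` and its parameters are our own, and the standard-library `NormalDist.inv_cdf` replaces the table lookup, so the result differs slightly from the rounded figure obtained with 2.58.

```python
import math
from statistics import NormalDist

def tosses_needed(confidence, rel_tol, p=0.5):
    """Smallest n for which rel_tol * sqrt(n p / q) reaches the required t."""
    q = 1 - p
    # Phi(t) - Phi(-t) = confidence, so t is the (1 + confidence)/2 quantile
    # of the standard normal distribution.
    t = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil((t / rel_tol) ** 2 * q / p)

n = tosses_needed(confidence=0.99, rel_tol=0.01)
print(n)
```

Here `rel_tol` is the permitted deviation expressed as a fraction of p, which is how "within 1 per cent of the probability" is read in the Example.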
PROBLEMS
1. In the Example of the text, how many times must we toss the coin to make the
probability 0.95 that the relative frequency is within 5 per cent of the probability?
2. On the average a certain student is able to solve 60 per cent of the problems assigned
to him. If an examination contains 8 problems and a minimum of 5 problems
is required for passing, what is the student's chance of passing? Hint: Because of the
law of large numbers, you may take the statement about the student's average performance
to mean: "His probability of solving any given problem is 0.6."
3. If Paul hits a target 80 times out of 100 on the average and John hits it 90 times
out of 100, what is the probability that at least one of them hits the target when they
shoot simultaneously?
4. If on the average in a shipment of 10 cases of certain goods 1 case is damaged, what
is the probability that out of 5 cases expected at least 4 will not be damaged?
ADDITIONAL TOPICS IN PROBABILITY
11. The Poisson Law. In the problem of repeated trials it may happen
that p is too small to permit the use of the normal approximation even
though n is large. A different approximation, which is called the Poisson
law or the law of small numbers , is now to be obtained for this case.
Starting with the formula for the probability of r successes in n trials,
$$B(r) = \frac{n!}{r!\,(n-r)!}\,p^r(1-p)^{n-r},$$
we replace n! and (n − r)! by their Stirling approximations to obtain
$$B(r) \cong \frac{n^n e^{-n}\sqrt{2\pi n}}{r!\,(n-r)^{n-r}e^{-(n-r)}\sqrt{2\pi(n-r)}}\;p^r(1-p)^{n-r}
= \frac{n^r e^{-r}}{r!\,[1-(r/n)]^{n-r+1/2}}\;p^r(1-p)^{n-r}. \tag{11-1}$$
Since the expected value of r is np, we can assume that r is small compared
with n. In this case¹
$$\left(1 - \frac{r}{n}\right)^{n-r+1/2} \cong e^{-r}.$$
Similarly, since p is small,
$$(1-p)^{n-r} \cong (1-p)^n = [(1-p)^{1/p}]^{np} \cong e^{-np}.$$
Substituting these two expressions into (11-1) yields the desired law of
small numbers:
$$B(r) \cong \frac{(np)^r}{r!}\,e^{-np}, \qquad n \text{ large}, \ np \text{ moderate}. \tag{11-2}$$
The result may be written
$$B(r) \cong \frac{\mu^r}{r!}\,e^{-\mu}, \tag{11-3}$$
where $\mu = np$ is the expected number of successes.
An application of this law to some specific cases may prove interesting.
Suppose it is known that, on the average, in a large city 2 persons die
daily of tuberculosis. What is the probability that x persons will die on
any day? In this case the expected number of deaths is $\mu = 2$, so that
$$B(x) = \frac{2^x}{x!}\,e^{-2}.$$
1 The reader is reminded that $\lim\,(1 + h)^{1/h} = e$ as h approaches zero through positive
or negative values. See I. S. Sokolnikoff, "Advanced Calculus," pp. 28-31, McGraw-Hill
Book Company, Inc., New York, 1939.
Therefore we have the following table:

  x    B(x)      x    B(x)      x    B(x)
  0    0.135     2    0.271     4    0.090
  1    0.271     3    0.180     5    0.036
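The entries of the table follow directly from (11-3) with $\mu = 2$. A minimal Python sketch reproducing them is given below; the comparison with the binomial law uses a hypothetical n = 10,000 and p = 0.0002 (chosen by us so that np = 2), since the text does not specify the size of the city.

```python
import math

def poisson(r, mu):
    """Poisson law (11-3)."""
    return mu**r / math.factorial(r) * math.exp(-mu)

def binomial(r, n, p):
    """Exact binomial law for r successes in n trials."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

table = [round(poisson(x, 2), 3) for x in range(6)]
print(table)

# Hypothetical n and p with np = 2, to compare the two laws term by term.
n, p = 10_000, 0.0002
for x in range(6):
    print(x, f"{binomial(x, n, p):.4f}", f"{poisson(x, 2):.4f}")
```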
The Poisson law has a significance far beyond its connection with the
binomial distribution, as will now be shown. Suppose points $x_i$ are distributed
at random on the x axis in such a fashion that the following
assumptions are valid:
1. The probability that a given number of points is in a given interval
depends only on the length of that interval (and not on any information
we may have about the points in adjacent intervals).
2. If $P(\Delta x)$ is the probability of 2 or more points in an interval of
length $\Delta x$, then $P(\Delta x)/\Delta x \to 0$ as $\Delta x \to 0$.
3. If $P_1(\Delta x)$ is the probability of 1 point in an interval of length $\Delta x$,
then $P_1(\Delta x)/\Delta x \to k$, a constant, as $\Delta x \to 0$.
In this case the probability $P_n(x)$ of n points in an interval of length x
satisfies the Poisson law
$$P_n(x) = \frac{(kx)^n}{n!}\,e^{-kx}. \tag{11-4}$$
To prove this result, consider an interval $(0, x + \Delta x)$ of length $x + \Delta x$. We can
have n points in this interval in three mutually exclusive ways. Either there are n
points in x and none in $\Delta x$, or there are n − 1 in x and 1 in $\Delta x$, or there are fewer than
n − 1 in x and at least 2 in $\Delta x$. The probability of this last alternative may be written
$\epsilon\,\Delta x$, where $\epsilon \to 0$ with $\Delta x$, in view of assumption 2.
Thus, by total and compound probability,
$$P_n(x + \Delta x) = P_n(x)P_0(\Delta x) + P_{n-1}(x)P_1(\Delta x) + \epsilon\,\Delta x.$$
Subtracting $P_n(x)$ from both sides and dividing by $\Delta x$ give
$$\frac{P_n(x + \Delta x) - P_n(x)}{\Delta x} = P_n(x)\,\frac{P_0(\Delta x) - 1}{\Delta x} + P_{n-1}(x)\,\frac{P_1(\Delta x)}{\Delta x} + \epsilon. \tag{11-5}$$
Since there must be no point, 1 point, or more than 1 point in an interval of length
$\Delta x$, we have
$$P_0(\Delta x) + P_1(\Delta x) + P(\Delta x) = 1$$
which gives
$$\frac{P_0(\Delta x) - 1}{\Delta x} = -\frac{P_1(\Delta x)}{\Delta x} - \frac{P(\Delta x)}{\Delta x}. \tag{11-6}$$
Taking the limit as $\Delta x \to 0$, we obtain −k in (11-6), and hence taking the limit in (11-5)
gives
$$\frac{d}{dx}P_n(x) = -kP_n(x) + kP_{n-1}(x), \qquad n \ge 1. \tag{11-7}$$
For n = 0 the term $P_{n-1}(x)$ is to be replaced by zero, so that
$$\frac{d}{dx}P_0(x) = -kP_0(x).$$
This separable differential equation yields
$$P_0(x) = ce^{-kx} = e^{-kx}$$
where the constant c = 1 since $P_0(0) = 1$; that is, an interval of zero length is sure to
contain no points. (This follows from assumption 2.)
Substituting $P_0(x)$ in the relation (11-7) for n = 1 we get
$$\frac{d}{dx}P_1(x) = -kP_1(x) + ke^{-kx}$$
which yields $P_1(x) = e^{-kx}(kx)$. Proceeding step by step or using mathematical induction,
we obtain (11-4).
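The step-by-step solution of (11-7) can be imitated numerically. The sketch below is ours, not part of the original proof: it integrates the system (11-7) by the simple Euler method from the initial values $P_0(0) = 1$, $P_n(0) = 0$ for $n \ge 1$, and compares the result with the closed form (11-4). The values k = 2 and x = 1.5, the step count, and the function name are arbitrary assumptions.

```python
import math

def integrate_poisson_system(k, x_end, n_max, steps=20_000):
    """Euler integration of dP_n/dx = -k P_n + k P_(n-1), with P_(-1) taken as 0."""
    dx = x_end / steps
    P = [1.0] + [0.0] * n_max                 # P_0(0) = 1, P_n(0) = 0 for n >= 1
    for _ in range(steps):
        new = [P[0] - k * P[0] * dx]
        for n in range(1, n_max + 1):
            new.append(P[n] + (-k * P[n] + k * P[n - 1]) * dx)
        P = new
    return P

k, x = 2.0, 1.5
numeric = integrate_poisson_system(k, x, n_max=5)
closed = [(k * x)**n / math.factorial(n) * math.exp(-k * x) for n in range(6)]
for n in range(6):
    print(n, f"{numeric[n]:.5f}", f"{closed[n]:.5f}")
```

The Euler method is used only for brevity; the agreement improves as the number of steps is increased.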
The following are some of the phenomena which satisfy the assumptions
1 to 3 quite accurately and which, accordingly, obey a Poisson law: the
distribution of automobiles on a highway, the distribution of starting
times for telephone calls, the clicks of a Geiger counter, the arrival times
for customers at a theater ticket office. The first example is a spatial
distribution, while the last three refer to distributions in time.
Example 1. What is the probability that the ace of spades will be drawn from a deck
of cards at least once in 104 consecutive trials?
This problem can be solved with the aid of the exact law (7-4) as follows: The probability
that the ace will not be drawn in the 104 trials is
$$B(0) = {}_{104}C_0\,(1/52)^0(51/52)^{104} = 0.133$$
and the probability that the ace will be drawn at least once is 1 − 0.133 = 0.867. On the
other hand, Poisson's law (11-2) gives for the probability of failure to draw the ace
$$B(0) = \frac{2^0}{0!}\,e^{-2}.$$
Hence, the probability of drawing at least one ace of spades is $1 - e^{-2} = 0.865$.
Example 2. Show that the constant k in the Poisson law (11-4) represents the expected
number of points in a unit interval.
Since the probability of n points in a unit interval is
$$P_n(1) = \frac{k^n}{n!}\,e^{-k},$$
the expected number is
$$E(n) = \sum_{n=1}^{\infty} n\,e^{-k}\,\frac{k^n}{n!} = e^{-k}k\sum_{n=1}^{\infty}\frac{k^{n-1}}{(n-1)!} = e^{-k}k\,e^{k} = k.$$
PROBLEMS
1. By use of the Poisson law compute the probability of (a) just one ace in 6 tosses of
a die, (b) just one double ace in 36 tosses of a pair of dice. Compare the binomial law
for cases (a) and (b). Which of the two cases satisfies the assumptions of the text more
exactly?
2. The probability is 0.0025 that a nail chosen at random from the output of a certain
machine will be defective. What is the probability that a keg of 1,000 nails made
by the machine will have at most 3 defective nails? Hint: The keg has "at most 3" if
it has 0, 1, 2, or 3 exactly. Use the Poisson approximation.
3. In Prob. 2 it is desired to have a probability of at least 0.95 that the keg has at
least 1,000 good nails. How many nails should the manufacturer put into the keg?
Hint: If he puts in n = 1,000 + m nails, he wants a probability 0.95 that the number
of defective nails will be at most m. Use the Poisson law, taking np ≅ 1,000p = 2.5.
4. On a certain one-way highway it is proposed to install a traffic signal which has a
60-sec red interval but a long green interval. The speed of the cars may be taken as
30 mph, and the expected number is 10 cars per mile of highway. Neglecting any
effects of slowing down, find the probability that just n cars will be obliged to stop
when the light is red. What is the probability that at most 5 cars must stop? What
is the expected number that must stop? Hint: Assume that the cars are distributed
according to the law (11-4), and see Example 2.
5. A certain circuit can transmit 3 telephone calls simultaneously. The expected
number of incoming calls is 1 per minute, and each call lasts 3 min. What is the probability
of getting a busy signal? Hint: You will find the line busy if 3 calls or more have
come in during the preceding 3-min interval. Use (11-4).
12. The Theory of Errors. In this section the methods of probability
are used to analyze the effect of experimental errors in measurement. If
n independent measurements give the values $m_1, m_2, \ldots, m_n$, we consider
questions such as the following: What is the best estimate for the quantity
being measured as determined by these measurements? What is the
probability that this best estimate is within 1 per cent, say, of the true
value? How much added precision is gained by increasing the number
of measurements?
Proceeding to the first question, let $m_1$ and $m_2$ be two independent
measurements of an unknown quantity m (such as the mass of an electron,
for instance). It is desired to find a best estimate for m based on the
measurements $m_1$ and $m_2$. To this end we denote the best estimate by
$\theta(m_1,m_2)$ and seek to determine the function $\theta$. Now, if both measurements
are increased by a given amount a, it seems reasonable to assume
that the estimate also increases by the amount a. In symbols,
$$\theta(m_1 + a, m_2 + a) = \theta(m_1,m_2) + a. \tag{12-1}$$
This relation is now postulated.
Similarly, if $m_1$ and $m_2$ are multiplied by a fixed quantity $\beta$, it is reasonable
to suppose that the best estimate is likewise multiplied by $\beta$. This
requirement leads to
$$\theta(\beta m_1, \beta m_2) = \beta\,\theta(m_1,m_2) \tag{12-2}$$
which is also postulated. [Equation (12-2) is quite obvious when we consider
the effect of a change of units. For instance, if grams are used instead
of kilograms as the unit of mass, we expect the estimate in grams to
be 1,000 times as great as the estimate in kilograms.]
Finally, since the two experiments are carried out under substantially
identical conditions, it does not matter which experimental result is $m_1$
and which is $m_2$. We are thus led to postulate that $\theta$ is symmetric:
$$\theta(m_1,m_2) = \theta(m_2,m_1). \tag{12-3}$$
It is a remarkable fact that the best estimate is wholly determined by
these requirements; if $\theta$ satisfies (12-1) to (12-3), then $\theta$ must be the arithmetic
mean,
$$\theta(m_1,m_2) = \frac{m_1 + m_2}{2}. \tag{12-4}$$
To establish (12-4), regard $m_1$ and $m_2$ as fixed and choose $a = -m_2$ in (12-1). There
results
$$\theta(m_1,m_2) = m_2 + \theta(m_1 - m_2, 0). \tag{12-5}$$
If this expression for $\theta(m_1,m_2)$ is used in the left-hand member of (12-2), one obtains
$$\beta m_2 + \theta(\beta m_1 - \beta m_2, 0) = \beta\,\theta(m_1,m_2). \tag{12-6}$$
Whenever $m_1 \ne m_2$, the choice $\beta = 1/(m_1 - m_2)$ in (12-6) gives
$$m_2 + \theta(1,0)(m_1 - m_2) = \theta(m_1,m_2) \tag{12-7}$$
if we multiply through by $m_1 - m_2$. And now (12-3) leads to
$$m_2 + \theta(1,0)(m_1 - m_2) = m_1 + \theta(1,0)(m_2 - m_1)$$
which implies $\theta(1,0) = \tfrac12$. Hence (12-7) yields (12-4). The case $m_1 = m_2$ is even simpler;
specifically, Eq. (12-5) gives
$$\theta(m_1,m_1) = m_1 + \theta(0,0) \tag{12-8}$$
and the choice $\beta = 0$ in (12-2) shows that $\theta(0,0) = 0$.
By analogy with (12-4), one generally assumes that the best value for
three or more measurements is also the arithmetic mean. Thus,
$$\theta(m_1,m_2,m_3) = \frac{m_1 + m_2 + m_3}{3}. \tag{12-9}$$
We shall now use this assumption to determine the underlying probability
distribution for the errors of measurement.
Let the true value of the quantity being measured be denoted by v.
The errors, then, are
$$x_i = m_i - v. \tag{12-10}$$
Since the experimental determinations are made under substantially
identical conditions, these random variables are all assumed to have the
same probability density f(x). And since the experiments are supposed
to be independent, the joint density for two or three variables is given
by the product:¹
$$f(x_1,x_2) = f(x_1)f(x_2) \tag{12-11}$$
$$f(x_1,x_2,x_3) = f(x_1)f(x_2)f(x_3). \tag{12-11a}$$
Our task is to determine the function f(x).
Now, v is the true value of the quantity being measured. It is not a
random variable, and it is not at the disposal of the experimenter. Nevertheless,
one can contemplate the effect of a change in v, and in particular,
one can consider that value of v which would maximize the probability of
the observed result. We now postulate that the value of v which maximizes
this probability is the arithmetic mean of the measurements. In other
words, the best estimate, (12-4) and (12-9), is assumed to be also a maximum-likelihood
estimate. It will be found that this assumption² enables us to
determine the form of the function f without any knowledge of the experimental
process.
If the probability (12-11a) is maximum when
$$v = \frac{m_1 + m_2 + m_3}{3}, \tag{12-12}$$
then the logarithm of the probability is also maximum. Thus
$$\log f(m_1 - v) + \log f(m_2 - v) + \log f(m_3 - v) \tag{12-13}$$
is maximum, as a function of v, when (12-12) holds. Setting the derivative
with respect to v equal to zero in (12-13), we obtain
$$\frac{f'(m_1 - v)}{f(m_1 - v)} + \frac{f'(m_2 - v)}{f(m_2 - v)} + \frac{f'(m_3 - v)}{f(m_3 - v)} = 0.$$
If F is defined by
$$F(x) = \frac{f'(x)}{f(x)},$$
1 If we think of the errors as being discrete variables with f the frequency function,
(12-11) is simply the law of compound probability for independent events. That is,
the probability of making an error $x_1$ in the first experiment and $x_2$ in the second is the
product of the individual probabilities. The corresponding result for continuous variables
and densities (stated in Sec. 6) is also a consequence of the theorem of compound
probability. The notion of independence is discussed further in Sec. 13.
2 We shall suppose also that f is positive and twice differentiable, though these requirements
could be somewhat relaxed.
the foregoing result, in the notation (12-10), is
$$F(x_1) + F(x_2) + F(x_3) = 0. \tag{12-14}$$
Equation (12-10) shows that (12-12) is equivalent to
$$x_1 + x_2 + x_3 = 0. \tag{12-15}$$
Thus, (12-14) holds whenever (12-15) holds. The corresponding statement
for two variables, obtained from (12-11), is that
$$F(x_1) + F(x_2) = 0 \tag{12-16}$$
whenever $x_1 + x_2 = 0$, and for one variable, we have
$$F(x_1) = 0 \quad\text{when } x_1 = 0. \tag{12-17}$$
From (12-16) we get $-F(x_3) = F(-x_3)$ by choosing $x_1 = x_3$, $x_2 = -x_3$,
and hence (12-14) gives
$$F(x_1) + F(x_2) = -F(x_3) = F(-x_3).$$
Since $-x_3 = x_1 + x_2$ by (12-15), the function F satisfies
$$F(x_1) + F(x_2) = F(x_1 + x_2).$$
Differentiating partially with respect to $x_1$ and $x_2$ leads to
$$F'(x_1) = F'(x_1 + x_2) \quad\text{and}\quad F'(x_2) = F'(x_1 + x_2).$$
Hence $F'(x_1) = F'(x_2)$. Holding $x_2$ constant, we see that $F'(x_1)$ is constant:
$$F'(x_1) = c$$
and hence $F(x_1) = cx_1$, since (12-17) gives F(0) = 0.
The relation
$$\frac{f'(x)}{f(x)} = F(x) = cx$$
yields
$$f(x) = Ke^{\frac12 cx^2}$$
where the constant K may be found from
$$1 = \int_{-\infty}^{\infty} f(x)\,dx = K\int_{-\infty}^{\infty} e^{\frac12 cx^2}\,dx.$$
Since the integral diverges if c ≥ 0, we set $c = -2h^2$ to obtain
$$\frac{1}{K} = \int_{-\infty}^{\infty} e^{-h^2x^2}\,dx = \frac{\sqrt{2\pi}}{\sqrt{2}\,h} = \frac{\sqrt{\pi}}{h}$$
by (10-1). Hence $K = h/\sqrt{\pi}$, and
$$f(x) = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}. \tag{12-18}$$
This result, known as the Gaussian law of error , states that the variable
y/2 hx is normally distributed. Specifically, the probability of
h < \/2 hx < t 2 (12-19)
h ) h
/ ^
Jiu
IS
~h*x*
dx
( 12 - 20 )
h /(V2M y/ir
by (12-18), and the change of variable t = y/2hx shows that (12-20) is
-4= /'• = s>(; 2 ) - «>«!). (12-21)
v 2 tt
The most important consideration justifying the use of this analysis in
practice is that systematic errors must be eliminated.
The constant $h$ measures the accuracy of the observer and is known as the precision constant. That particular error which has probability ½ to be exceeded in magnitude is called the probable error; it is found to be $0.4769/h$ by use of (12-19), (12-21), and Appendix D. Another interpretation of the constant $h$ is afforded by considering the mean-absolute error
$$E(|x|) = \int_{-\infty}^{\infty} |x|\,f(x)\,dx = \frac{2h}{\sqrt{\pi}}\int_0^{\infty} x\,e^{-h^2x^2}\,dx = \frac{1}{h\sqrt{\pi}} = \frac{0.5642}{h} \tag{12-22}$$
and still a third interpretation is given by the mean-square error
$$E(x^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \frac{2h}{\sqrt{\pi}}\int_0^{\infty} x^2 e^{-h^2x^2}\,dx = \frac{1}{2h^2}. \tag{12-23}$$
The final question mentioned at the beginning of this section concerns the effect of increasing the number of measurements $n$. Since $x_i = m_i - v$, we have
$$\bar{x} = \bar{m} - v$$
where the bar denotes the arithmetic mean:
$$\bar{x} = \frac{1}{n}\sum x_i, \qquad \bar{m} = \frac{1}{n}\sum m_i.$$
Thus, the error in the mean is the mean of the errors. It is likely to be smaller than the error in a single measurement because positive and negative errors tend to cancel when we form $\sum x_i$. For the Gaussian distribution (12-18) the situation is especially simple; namely, $\bar{x}$ has a Gaussian distribution with precision constant $h\sqrt{n}$, whenever the independent measurements $x_i$ have Gaussian distributions with precision constant $h$. Thus, if the inequality $|x| < a$ has probability $p$, then the inequality $|\bar{x}| < a/\sqrt{n}$ has the same probability $p$. This result shows how much more precision is attained by increasing the number of measurements.
The proof is omitted because it involves a tedious evaluation of multiple integrals.* However, the essential meaning of the result is that the “scatter” or “spread” for $\bar{x}$ is $1/\sqrt{n}$ times as great as the corresponding spread for $x$. When interpreted in this fashion the property follows from the results established in Sec. 14.
PROBLEMS
1. (a) Show that the sum of the squares of the errors $\sum (m_i - v)^2$ is least if the true value $v$ happens to be the arithmetic mean of the measurements $m_i$. (b) Deduce that the arithmetic mean $\bar{m}$ is a maximum-likelihood estimate for $v$ when there are $n$ independent measurements each satisfying (12-18). Hint: It is required to choose $v$ so that
$$f(x_1, x_2, \ldots, x_n) = f(x_1)f(x_2)\cdots f(x_n) = \left(\frac{h}{\sqrt{\pi}}\right)^n e^{-h^2\sum (m_i - v)^2}$$
is maximum. Use the result (a).
2. In a certain experiment which satisfies the conditions of the text, the probable error is 0.01. A measurement $m_1$ is about to be made. What is the probability that the interval $(m_1 - 0.02,\ m_1 + 0.02)$ will contain the true value $v$? Hint: First find $h$, then note that the stated result happens if, and only if, $|x_1| < 0.02$.
13. Variance, Covariance, and Correlation. Two random variables $x$ and $y$ are said to be independent if the event $x = x_i$ and the event $y = y_j$ are independent events for each choice of $x_i$ in the range of $x$ and each $y_j$ in the range of $y$. In other words, knowledge that $y$ has a particular value must not influence the probabilities associated with $x$. The numbers shown on two successive tosses of a die are independent in this sense (and so were the measurements $m_i$ considered in the last section). On the other hand, the number of heads in the first three tosses and in the first four tosses of a coin are dependent variables.
The product $xy$ of two random variables is a random variable which equals $x_iy_j$ when $x = x_i$ and $y = y_j$. Although it is not usually true that the expectation of a product is the product of the expectations, this is the case when the variables are independent. In symbols,
$$E(xy) = E(x)E(y), \qquad x,\ y \text{ independent.} \tag{13-1}$$
The proof is simple. If $p_i$ is the probability that $x = x_i$, and if $q_j$ is the probability that $y = y_j$, then the assumed independence gives $p_iq_j$ for the probability that simultaneously $x = x_i$ and $y = y_j$. Hence
$$E(xy) = \sum\sum p_iq_jx_iy_j = \left(\sum p_ix_i\right)\left(\sum q_jy_j\right) = E(x)E(y).$$
* See J. V. Uspensky, “Introduction to Mathematical Probability,” chap. 13, McGraw-Hill Book Company, Inc., 1937, for a direct verification. An indirect method based on the theory of moments is given in P. G. Hoel, “Introduction to Mathematical Statistics,” sec. 6.4, John Wiley & Sons, Inc., New York, 1954. See also M. E. Munroe, “The Theory of Probability,” pp. 91-96, McGraw-Hill Book Company, Inc., New York, 1951.
When a discussion involves several variables $x$, $y$, . . . , it is convenient to denote expectations by the letter $\mu$, with a subscript to indicate the variable. Thus, we write
$$E(x) = \mu_x, \qquad E(y) = \mu_y$$
and so on. For example, (13-1) in this notation takes the form
$$\mu_{xy} = \mu_x\mu_y, \qquad x,\ y \text{ independent.} \tag{13-2}$$
To measure the deviation of a variable from its expected value $\mu$, one introduces a quantity $\sigma$ defined by¹
$$\sigma = \sqrt{E(x - \mu)^2} \quad \text{or} \quad \sigma^2 = E(x - \mu)^2. \tag{13-3}$$
The expression $\sigma$ is called the standard deviation, and its square $\sigma^2$ is called the variance. As for $\mu$, here too it is customary to use a subscript when several variables have to be distinguished. For example,
$$\sigma_x^2 = E(x - \mu_x)^2, \qquad \sigma_y^2 = E(y - \mu_y)^2.$$
To illustrate the calculation of a variance by means of the definition, let $x$ denote the number of heads obtained when 3 coins are tossed. Since $\mu = E(x) = 3/2$ we have the following table:

x               0      1      2      3
x - μ         -3/2   -1/2    1/2    3/2
(x - μ)²       9/4    1/4    1/4    9/4
Probability    1/8    3/8    3/8    1/8

The definition of expectation now gives
$$\sigma^2 = E(x - \mu)^2 = \tfrac{1}{8}\cdot\tfrac{9}{4} + \tfrac{3}{8}\cdot\tfrac{1}{4} + \tfrac{3}{8}\cdot\tfrac{1}{4} + \tfrac{1}{8}\cdot\tfrac{9}{4} = \tfrac{3}{4}.$$
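The table can be reproduced by direct enumeration of the eight equally likely outcomes. This is a small sketch using exact fractions; the variable names are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of tossing 3 coins.
outcomes = list(product("HT", repeat=3))
prob = Fraction(1, len(outcomes))

# x = number of heads in each outcome.
heads = [outcome.count("H") for outcome in outcomes]

# Expectation and variance straight from the definitions.
mu = sum(prob * x for x in heads)
variance = sum(prob * (x - mu) ** 2 for x in heads)

print(mu, variance)
```

Working with `Fraction` keeps the arithmetic exact, so the result can be compared with the tabulated value without rounding.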
If $E(x) = \mu_x$ and $E(y) = \mu_y$, the quantity
$$\sigma_{xy} = E(x - \mu_x)(y - \mu_y) \tag{13-4}$$
is called the covariance of $x$ and $y$. The covariance is a generalization of the variance, in that the special case $y = x$ gives
$$\sigma_{xx} = E(x - \mu_x)(x - \mu_x) = E(x - \mu_x)^2 = \sigma_x^2.$$
As an illustration, let us compute $\sigma_{xy}$ when $x$ is the number of heads obtained on the first 2 tosses and $y$ the number obtained altogether in 3 tosses of an unbiased coin. Here $\mu_x = 1$, $\mu_y = 3/2$, so that we have the following table:

Event       HHH    HHT    HTH    HTT    THH    THT    TTH    TTT
x - μ_x      1      1      0      0      0      0     -1     -1
y - μ_y     3/2    1/2    1/2   -1/2    1/2   -1/2   -1/2   -3/2
Product     3/2    1/2     0      0      0      0     1/2    3/2

Since the associated probabilities are 1/8, we take 1/8 times the sum of the entries in the last row to get
$$\sigma_{xy} = \tfrac{1}{2}. \tag{13-5}$$

¹ The intent is $E[(x - \mu)^2]$, not $[E(x - \mu)]^2$.
We shall now obtain an expression for $\sigma_{xy}$ which is often more useful than (13-4). Expanding the product in (13-4) gives
$$\sigma_{xy} = E(xy - y\mu_x - x\mu_y + \mu_x\mu_y) = E(xy) - E(y)\mu_x - E(x)\mu_y + \mu_x\mu_y.$$
Upon recalling that $E(x) = \mu_x$ and $E(y) = \mu_y$ we get
$$\sigma_{xy} = E(xy) - E(x)E(y) = \mu_{xy} - \mu_x\mu_y, \tag{13-6}$$
which is the required formula.
To apply this formula to the preceding example, we construct the following table:

Event    HHH    HHT    HTH    HTT    THH    THT    TTH    TTT
x         2      2      1      1      1      1      0      0
y         3      2      2      1      2      1      1      0
xy        6      4      2      1      2      1      0      0

Taking 1/8 times the sum of the last entries gives $E(xy) = 2$, and hence by (13-6)
$$\sigma_{xy} = 2 - (1)\left(\tfrac{3}{2}\right) = \tfrac{1}{2}.$$
The special case $x = y$ in (13-6) gives an alternative form¹ of (13-3), namely,
$$\sigma^2 = E(x^2) - \mu^2 = E(x^2) - [E(x)]^2. \tag{13-7}$$
As an illustration the reader may apply this formula to the preceding example to obtain
$$\sigma_x^2 = \tfrac{3}{2} - (1)^2 = \tfrac{1}{2}, \qquad \sigma_y^2 = 3 - \left(\tfrac{3}{2}\right)^2 = \tfrac{3}{4}. \tag{13-8}$$

¹ Note that $\sigma^2$ gives the moment of inertia of the area under the distribution curve $y = f(x)$ about the line $x = \mu$ which passes through the center of mass. From this viewpoint (13-7) is the familiar formula for moment of inertia after a change of rotational axes.
If the variables $x$ and $y$ are independent, (13-2) and (13-6) give $\sigma_{xy} = 0$. Hence when $\sigma_{xy} \ne 0$, the variables must be related. A quantitative measure of the strength of the relationship is given by the correlation coefficient $\rho$:
$$\rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}. \tag{13-9}$$
For example, in the foregoing illustration (13-5) and (13-8) yield
$$\rho = \frac{1/2}{\sqrt{1/2}\,\sqrt{3/4}} = \frac{\sqrt{6}}{3} = 0.816. \tag{13-10}$$
Thus, if two variables $x$ and $y$ have a correlation coefficient $\rho = 0.8$, then they are about as strongly related as are the numbers of heads on the first two tosses and on the first three tosses of an unbiased coin.
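The covariance and correlation of this coin-tossing illustration can be recomputed by enumeration, using both the definition (13-4) and the shortcut (13-6). The helper names `E`, `x`, and `y` below are introduced only for this sketch.

```python
from fractions import Fraction
from itertools import product
import math

# The 8 equally likely outcomes of 3 tosses of an unbiased coin.
outcomes = list(product("HT", repeat=3))
prob = Fraction(1, len(outcomes))

def E(g):
    """Expectation of g(outcome) over the equally likely outcomes."""
    return sum(prob * g(o) for o in outcomes)

def x(o):
    """Number of heads on the first 2 tosses."""
    return o[:2].count("H")

def y(o):
    """Number of heads in all 3 tosses."""
    return o.count("H")

mu_x, mu_y = E(x), E(y)

# Covariance from the definition (13-4) and from the shortcut (13-6).
cov_definition = E(lambda o: (x(o) - mu_x) * (y(o) - mu_y))
cov_shortcut = E(lambda o: x(o) * y(o)) - mu_x * mu_y

# Variances from (13-7), then the correlation coefficient (13-9).
var_x = E(lambda o: x(o) ** 2) - mu_x ** 2
var_y = E(lambda o: y(o) ** 2) - mu_y ** 2
rho = float(cov_definition) / math.sqrt(float(var_x) * float(var_y))

print(cov_definition, cov_shortcut, var_x, var_y, round(rho, 3))
```

The two covariance values agree exactly because the arithmetic is carried out in fractions; only the final square root is done in floating point.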
The correlation coefficient has the value 1 if $y = x$, and, as we have already observed, $\rho = 0$ when $x$ and $y$ are unrelated. Moreover, $\rho$ does not change if $x$ and $y$ are each multiplied by a positive constant factor. Thus, if the correlation coefficient indicates a certain strength of relationship for $x$ and $y$, it will give the same strength of relationship for $2x$ and $3y$. Similarly, $\rho$ is unaffected by addition of a constant; for instance, $x - 2$ and $y - 3$ have the same $\rho$ as $x$ and $y$.
In spite of having these desirable properties, $\rho$ is not always a reliable measure of dependence, and many statistical studies have led to erroneous conclusions through an incorrect interpretation of correlation. It is quite possible to have the variables so strongly related that $y$ is a function of $x$ and yet $\rho = 0$. Before a correlation coefficient can be used with confidence, one must know something about the underlying probability distribution.
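A concrete instance of the last remark may help. The distribution below (x taking the values -1, 0, 1 with equal probability, and y = x²) is an assumption chosen for this sketch, not an example from the text; it makes y a function of x while the covariance, and hence the correlation coefficient, vanishes.

```python
from fractions import Fraction

# Hypothetical distribution: x is -1, 0, or 1, each with probability 1/3,
# and y is completely determined by x through y = x^2.
values = [-1, 0, 1]
prob = Fraction(1, 3)

E_x = sum(prob * x for x in values)
E_y = sum(prob * x ** 2 for x in values)
E_xy = sum(prob * x * x ** 2 for x in values)

# Covariance by (13-6); it is zero, so the correlation coefficient is zero.
covariance = E_xy - E_x * E_y

# Yet x and y are dependent: P(x = 1 and y = 1) differs from P(x = 1) P(y = 1).
p_joint = prob                         # x = 1 forces y = 1
p_product = prob * Fraction(2, 3)      # P(x = 1) * P(y = 1)

print(covariance, p_joint, p_product)
```

Here the symmetry of the distribution about zero cancels the positive and negative contributions to E(xy), which is why a zero correlation cannot be read as independence.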
The variables $x$ and $y$ are said to have a bivariate normal distribution when
$$f(x,y) = e^{ax^2 + bxy + cy^2 + dx + ey + f}, \qquad a, b, c, d, e, f \text{ const.}$$
In this important case the theory of correlation has been fully developed, and it is found¹ that $\rho$ actually does measure the strength of the relationship between $x$ and $y$.
Example: A variable $x$ is said to be “normally distributed with mean $\mu$ and variance $\sigma^2$” when its density function is
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x - \mu)^2/(2\sigma^2)}, \qquad \mu,\ \sigma \text{ const.}$$
Show that the mean is indeed $\mu$ and the variance $\sigma^2$.
By the definition of expectation,
$$E(x - \mu) = \int_{-\infty}^{\infty} (x - \mu)f(x)\,dx = \frac{\sigma}{\sqrt{2\pi}}\int_{-\infty}^{\infty} t\,e^{-t^2/2}\,dt = 0$$
when we set $t = (x - \mu)/\sigma$. Hence $E(x - \mu) = 0$, which gives $E(x) = \mu$. The same change of variable leads to
$$E(x - \mu)^2 = \frac{\sigma^2}{\sqrt{2\pi}}\int_{-\infty}^{\infty} t^2 e^{-t^2/2}\,dt = \sigma^2$$
as we see upon integrating by parts and using (10-1). Since $\mu = E(x)$, the latter result $E(x - \mu)^2$ is the variance by definition.
Choosing $\mu = 0$, $\sigma = 1/(h\sqrt{2})$, we obtain (12-18), and hence the precision constant $h$ is given by
$$h = \frac{1}{\sqrt{2}\,\sigma} \tag{13-11}$$
[cf. (12-23)]. This fact gives a method for estimating $h$ from the data, as we shall see in the following sections.

¹ See Hoel, op. cit., chap. 8.
PROBLEMS
1. Compute $\sigma^2$ if $x$ is uniformly distributed on the interval $0 < x < 1$.
2. Let $x$ be the number on top and $y$ the number on the bottom in a toss of a true die. Compute $E(x)$, $E(y)$, $E(xy)$, and the covariance. Does your work indicate that the variables are dependent? Find the correlation coefficient.
3. Three coins are tossed. Let $x$ be the number of heads shown by the first coin, whereas $y$ is the number of heads shown by all the coins. Compute the correlation coefficient. Your result should be smaller than the value (13-10). Why?
14. Arithmetic Means. In many applications one does not consider a single variable, but rather one obtains the mean of a large number of variables. For instance, if $x$ is a measure of the length of a rod, one would make several measurements $x_1, x_2, \ldots, x_n$ and use the arithmetic mean,
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \tag{14-1}$$
in accordance with the procedure of Sec. 12. Here the $x_i$'s are not the different values of a single variable but are $n$ random variables describing the result of $n$ independent measurements.
Just as one uses $\sigma_x$ to indicate the standard deviation of the variable $x$, it is customary to let $\sigma_{\bar{x}}$ denote the standard deviation of $\bar{x}$. The following theorem enables us to compute $\sigma_{\bar{x}}$ from $\sigma_x$ in many cases:
Theorem. If the variables $x_i$ are independent, if they have the same expectation $E(x_i) = \mu$ and the same variance $\sigma^2$, then
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}. \tag{14-2}$$
For proof, observe that
$$E(x_1 + \cdots + x_n) = E(x_1) + \cdots + E(x_n) = n\mu.$$
The variance of $x_1 + \cdots + x_n$ is therefore
$$E(x_1 + \cdots + x_n - n\mu)^2,$$
which may be written
$$E[(x_1 - \mu) + (x_2 - \mu) + \cdots + (x_n - \mu)]^2.$$
Expanding the bracket we obtain
$$E\left[\sum_i (x_i - \mu)^2 + \sum_{i \ne j} (x_i - \mu)(x_j - \mu)\right]. \tag{14-3}$$
Since the variables are independent, the covariance of $x_i$ and $x_j$ is zero for $i \ne j$; that is,
$$E(x_i - \mu)(x_j - \mu) = 0.$$
Also the definition of $\sigma^2$ gives
$$\sigma^2 = E(x_i - \mu)^2.$$
Hence, taking the expectation in (14-3) yields
$$E(x_1 + \cdots + x_n - n\mu)^2 = n\sigma^2.$$
Dividing by $n^2$ we have
$$E\left[\frac{x_1 + \cdots + x_n}{n} - \mu\right]^2 = \frac{\sigma^2}{n},$$
which gives (14-2) upon taking the square root.
The intuitive meaning of this result is approximately as follows: Suppose a single measurement varies over an interval of length $l$ about the true value, so that $l$ measures the scatter or spread. Then the mean of $n$ independent measurements will have a spread of the order of $l/\sqrt{n}$ about the true value.
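The theorem (14-2) can be verified exactly in a small case. The sketch below assumes, purely for illustration, that each $x_i$ is the number shown by a fair die; it enumerates all outcomes of n throws and compares the variance of the mean with $\sigma^2/n$.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)   # assumed example: throws of a fair die

def variance_of_mean(n):
    """Exact variance of the arithmetic mean of n independent throws."""
    outcomes = list(product(FACES, repeat=n))
    prob = Fraction(1, len(outcomes))
    means = [Fraction(sum(outcome), n) for outcome in outcomes]
    mu = sum(prob * m for m in means)
    return sum(prob * (m - mu) ** 2 for m in means)

sigma2 = variance_of_mean(1)          # variance of a single throw
for n in (1, 2, 3, 4):
    # The two printed fractions should coincide, as (14-2) asserts.
    print(n, variance_of_mean(n), sigma2 / n)
```

Enumeration grows as 6ⁿ, so this check is practical only for small n; it is meant as a confirmation of the formula, not as a computational method.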
To illustrate the use of (14-2) let $x_i = 1$ if there is success at the $i$th trial in a set of independent trials with probability $p$, and let $x_i = 0$ otherwise. For each variable $x_i$ we have $x_i^2 = x_i$ and hence
$$E(x_i^2) = E(x_i) = p\cdot 1 + q\cdot 0 = p.$$
By (13-7) the corresponding variance is
$$\sigma^2 = p - p^2 = p(1 - p) = pq$$
and (14-2) now gives
$$\sigma_{\bar{x}} = \sqrt{\frac{pq}{n}}.$$
For the variables $x_i$ considered in the foregoing paragraph the mean $\bar{x}$ is simply the relative frequency $m/n$, where $m$ is the number of successes. We have, then,
$$\left[E\left(\frac{m}{n} - p\right)^2\right]^{1/2} = \sqrt{\frac{pq}{n}}, \tag{14-4}$$
which shows again that the relative frequency $m/n$ is likely to be close to $p$ when $n$ is large. The corresponding result for a general variable $x$ is based on (14-2); it leads to assertions concerning $|E(x) - \bar{x}|$ which are similar to the theorem established in Sec. 10 but of greater scope.
Multiplying (14-4) through by $n$ we get
$$[E(m - np)^2]^{1/2} = \sqrt{npq}.$$
This gives an interpretation for the quantity $\sqrt{npq}$ that arose in connection with the normal law (Sec. 10); namely, $\sqrt{npq}$ is the standard deviation of the number of successes $m$.
15. Estimation of the Variance. If $x_1, x_2, \ldots, x_n$ are $n$ independent observations of a variable $x$, the sample variance is defined by
$$s^2 = \frac{1}{n}\sum (x_i - \bar{x})^2. \tag{15-1}$$
Unlike the theoretical variance $\sigma^2$, the sample variance is computed from the observations, hence is actually available. It will be seen, now, that $s^2$ can be used to estimate $\sigma^2$.
We have
$$E(ns^2) = \sum E(x_i - \bar{x})^2 = \sum E[(x_i - \mu) - (\bar{x} - \mu)]^2 = \sum [E(x_i - \mu)^2 - 2E(x_i - \mu)(\bar{x} - \mu) + E(\bar{x} - \mu)^2]. \tag{15-2}$$
Now, $E(x_i - \mu)^2 = \sigma^2$ by definition, and $E(\bar{x} - \mu)^2 = \sigma^2/n$ by (14-2). For the middle term in (15-2) we get
$$E(x_i - \mu)(\bar{x} - \mu) = \frac{1}{n}E(x_i - \mu)(x_1 + \cdots + x_n - n\mu) = \frac{1}{n}E(x_i - \mu)(x_i - \mu + \cdots) = \frac{1}{n}E(x_i - \mu)^2 = \frac{1}{n}\sigma^2$$
when we note that the terms not written explicitly are independent of $x_i$. That is, for $i \ne j$, Eq. (13-1) gives
$$E[(x_i - \mu)x_j] = E(x_i - \mu)E(x_j) = 0\cdot\mu = 0.$$
Substituting into (15-2) yields the important formula
$$E(ns^2) = (n - 1)\sigma^2. \tag{15-3}$$
If (15-3) is divided by $n$, we get
$$E\left[\frac{1}{n}\sum (x_i - \bar{x})^2\right] = \frac{n - 1}{n}\sigma^2 \tag{15-4}$$
upon recalling (15-1). On the other hand the definition of $\sigma^2$ gives
$$E[(x_i - \mu)^2] = \sigma^2. \tag{15-5}$$
It is not surprising that (15-4) gives a smaller value than (15-5), inasmuch as the choice $\mu = \bar{x}$ is the value of $\mu$ that minimizes (15-5) (cf. Prob. 1, Sec. 12). The fact that (15-4) should be smaller than (15-5) is especially clear when there is only one measurement, $x_1$. In this case (15-4) gives zero because $x_1 = \bar{x}$.
The foregoing remarks indicate that $s^2$ is not a suitable estimate of $\sigma^2$; it has a tendency to be too small. But if we divide (15-3) by $n - 1$ for $n \ge 2$, we get
$$E\left(\frac{n}{n - 1}\,s^2\right) = \sigma^2,$$
which gives the following theorem:
Theorem. Let $x_1, x_2, \ldots, x_n$ be $n$ independent observations of a variable $x$, with $n \ge 2$. If $s^2$ is the sample variance, then the quantity
$$\hat{\sigma}^2 = \frac{n}{n - 1}\,s^2 \tag{15-6}$$
is an unbiased estimate of $\sigma^2$. That is, $E(\hat{\sigma}^2) = \sigma^2$.
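The bias of $s^2$ and its removal by the factor $n/(n-1)$ can be confirmed exactly in a small case. As an assumption made only for this sketch, the observations are taken to be throws of a fair die, whose variance is 35/12.

```python
from fractions import Fraction
from itertools import product

def sample_variance(xs):
    """s^2 of (15-1): mean squared deviation from the sample mean."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** 2 for x in xs) / n

def expected_sample_variance(n):
    """Exact E(s^2) over all equally likely samples of n die throws."""
    outcomes = list(product(range(1, 7), repeat=n))
    return sum(sample_variance(o) for o in outcomes) / len(outcomes)

n = 3
sigma2 = Fraction(35, 12)                       # variance of one throw
e_s2 = expected_sample_variance(n)
print(e_s2, Fraction(n - 1, n) * sigma2)        # both sides of (15-3) divided by n
print(Fraction(n, n - 1) * e_s2, sigma2)        # the unbiased estimate (15-6)
```

Each printed pair should show two equal fractions, the first for the biased sample variance and the second for the corrected estimate.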
To illustrate the use of the theorem, let
$$m_1 = 12, \qquad m_2 = 8, \qquad m_3 = 13$$
be three measurements of an unknown quantity whose true value is $v$. The errors in the measurement are $x_i = m_i - v$, but since
$$x_i - \bar{x} = m_i - v - \bar{m} + v = m_i - \bar{m} \tag{15-7}$$
we can compute $\hat{\sigma}^2$ without knowing $v$. By (15-1) and (15-7),
$$ns^2 = \sum (x_i - \bar{x})^2 = \sum (m_i - \bar{m})^2.$$
In this example $\bar{m} = 11$, so that
$$ns^2 = (1)^2 + (-3)^2 + (2)^2 = 14.$$
Hence an estimate for $\sigma^2$ is
$$\hat{\sigma}^2 = \frac{ns^2}{n - 1} = \frac{14}{2} = 7.$$
According to (13-11) the precision constant $h$ is estimated as $h \cong 1/(\sqrt{2}\,\hat{\sigma}) = 1/\sqrt{14} = 0.27$. In statistics it is shown how one can determine the reliability of an estimate such as this, though we do not pursue the subject here.¹
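The computation in this illustration is easily mechanized. The sketch below follows (15-6) and (13-11); the function names are chosen here for illustration.

```python
import math

def unbiased_variance(measurements):
    """Estimate sigma^2 by n s^2 / (n - 1), as in (15-6)."""
    n = len(measurements)
    if n < 2:
        raise ValueError("at least two measurements are required")
    mean = sum(measurements) / n
    ns2 = sum((m - mean) ** 2 for m in measurements)
    return ns2 / (n - 1)

def precision_constant(measurements):
    """Estimate h = 1 / (sqrt(2) * sigma), as in (13-11)."""
    return 1.0 / math.sqrt(2.0 * unbiased_variance(measurements))

data = [12, 8, 13]
print(unbiased_variance(data))              # the text obtains 7
print(round(precision_constant(data), 2))   # the text obtains 0.27
```

Only the differences from the mean enter the computation, which is why the unknown true value v is not needed.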
PROBLEMS
1. A certain experiment gave the measurements
$$m_i = 17,\ 21,\ 20,\ 18,\ 14.$$
Obtain an unbiased estimate for the variance of a single measurement, and from this, estimate the precision constant.
¹ See Hoel, op. cit., chap. 10.
2. If the precision constant in Prob. 1 can be assumed exactly equal to your estimate of it, (a) what is the probability that the next measurement will be within 0.5 of the true value? (b) How many measurements must you make if you want a probability 0.95 that the mean of those measurements will be within 0.1 of the true value? Hint: Use the fact that the precision constant of the mean is $h\sqrt{n}$ if that of a single measurement is $h$.
3. In a certain measuring routine the cost of equipment and materials is negligible but the time required is proportional to the number of measurements. Give a rational method of adjusting the salaries of two observers whose working speeds are $s_1$ and $s_2$ if the precision constants of their measurements are $h_1$ and $h_2$. Hint: Consider the number of measurements each must make to attain equal reliability in the respective arithmetic means.
4. Discuss Prob. 3 if the cost of equipment is proportional to the length of time it is used and the cost of material is proportional to the number of measurements.
CHAPTER 9
NUMERICAL ANALYSIS
Solution of Equations
1. Graphical Methods
2. Simple Iterative Methods
3. Newton's Method
4. Systems of Linear Equations. The Gauss Reduction
5. An Iterative Method for Systems of Linear Equations
Interpolation. Empirical Formulas. Least Squares
6. Differences
7. Polynomial Representation of Data
8. Newton's Interpolation Formulas
9. Lagrange's Interpolation Formula
10. Empirical Formulas
11. The Method of Least Squares
12. Harmonic Analysis
Numerical Integration of Differential Equations
13. Numerical Integration
14. Euler's Polygonal Curves
15. The Adams Method
16. Equations of Higher Order. Systems of Equations
17. Boundary-value Problems
18. Characteristic-value Problems
19. Method of Finite Differences
The principal concern of numerical analysis is with the construction of
effective methods for the calculation of unknowns entering in the formula-
tion of a given problem. Since every formulation of a practical problem
involves assumptions and approximations, it is senseless to seek unknowns
to a higher precision than is warranted by the initial data. A simple and
perhaps crude technique giving the desired values within specified limits
of tolerance is always to be preferred to an involved method capable of
yielding an arbitrary degree of accuracy.
In recent years the growth of numerical analysis was accelerated by the
demands of science and technology for numerical solutions of many pressing
problems. High-speed computing machines produced for coping with
such problems are certain to open new vistas in science and leave a pro-
found imprint in all fields of human activity.
It is the object of this chapter to present the rudiments of numerical
analysis essential to all concerned with the processing of numerical data.
Inasmuch as the understanding of principles must precede the acquisition
of computing skills, the emphasis in the following sections is placed on
basic ideas and general methods rather than on special techniques useful
in solving this or that problem. Among topics included here are the
determination of real roots of algebraic and transcendental equations,
the basic method for solving systems of linear equations, the elements of
interpolation theory, and its bearing on curve fitting and numerical so-
lution of differential equations.
SOLUTION OF EQUATIONS
1. Graphical Methods. Geometric considerations usually are a useful guide in the construction of analytic methods of solution of practical problems. This is particularly true in the problem of determination of numerical values of the roots of algebraic and transcendental equations.¹

¹ A polynomial equation $x^n + a_1x^{n-1} + \cdots + a_n = 0$ is called an algebraic equation. An equation $F(x) = 0$ which is not reducible to an algebraic equation is called transcendental. Thus, $\tan x - x = 0$ is a transcendental equation, and so is $e^x + 2\cos x = 0$.
If $F(x)$ is a real continuous function, the equation
$$F(x) = 0 \tag{1-1}$$
may have real roots. The approximate values of such roots can be determined by graphing the function $y = F(x)$ and reading from the graph the values of $x$ for which $y = 0$. This familiar procedure for graphical determination of real roots can frequently be simplified by rewriting (1-1) in the form
$$f(x) = g(x). \tag{1-2}$$
The abscissas of points of intersection of the curves $y = f(x)$ and $y = g(x)$ will obviously be the roots of (1-2).
Thus, an approximate value of the real root of
$$F(x) \equiv x^3 - 146.25x - 682.5 = 0$$
can be found by graphing the function
$$y = x^3 - 146.25x - 682.5.$$
It is simpler, however, to plot the cubic
$$y = x^3$$
and the straight line (Fig. 1)
$$y = 146.25x + 682.5$$
and read off from the graph the abscissa of their point of intersection $P_0$.
An obvious disadvantage of graphical
methods is that they require plotting curves
on a large scale when a high degree of accu-
racy is desired. To avoid this, one obtains
more precise values by applying one of the
several methods of successive approxima-
tions discussed in Secs. 2 and 3. All these
methods require that the desired root be
first isolated. That is, they call for the
determination of an interval which contains
just the root in question and no others.
If $F(x)$ is a continuous function, and if for a certain pair of real values $x = x_1$, $x = x_2$, the signs of $F(x_1)$ and $F(x_2)$ are opposite, then it is obvious that $F(x) = 0$ has at least one real root in the interval $(x_1, x_2)$. If there are several roots in $(x_1, x_2)$, one usually narrows down this interval by a succession of judicious trials until an interval is obtained which contains just the desired root. For efficient application of the successive-approximations methods it is desirable that this interval be as small as possible.
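The isolation of a root by sign changes can be sketched in a few lines. The scanning range, the step size, and the use of repeated halving to narrow the interval are assumptions made for this illustration (the text speaks only of "judicious trials"); the function is the cubic of the preceding example.

```python
def F(x):
    """The cubic treated graphically above."""
    return x ** 3 - 146.25 * x - 682.5

def isolate_roots(func, a, b, steps):
    """Return subintervals of [a, b] on which func changes sign.

    A sign change guarantees at least one root of a continuous function;
    a subinterval without a sign change may still hide an even number of roots.
    """
    xs = [a + (b - a) * k / steps for k in range(steps + 1)]
    brackets = []
    for left, right in zip(xs, xs[1:]):
        if func(left) * func(right) < 0:
            brackets.append((left, right))
    return brackets

def narrow(func, left, right, width=1e-6):
    """Halve a sign-change interval until it is shorter than width."""
    while right - left > width:
        mid = 0.5 * (left + right)
        if func(left) * func(mid) <= 0:
            right = mid
        else:
            left = mid
    return left, right

brackets = isolate_roots(F, -20.0, 20.0, 40)   # unit steps, an arbitrary choice
print(brackets)
print(narrow(F, *brackets[0]))
```

With unit steps the scan reports a single interval, in agreement with the statement that the cubic has one real root.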
We note in passing that no general methods are available for the exact
determination of the roots of transcendental equations. Also, there are
no algebraic formulas for the solution of general algebraic equations of
degree higher than 4. The so-called Cardan and Ferrari solutions of the
cubic and quartic equations require the calculation of cube roots of quan-
tities which themselves are square roots. Generally it is simpler to obtain
the desired approximations by methods described in the following sections
than to make use of Cardan's formulas. 1
PROBLEMS
1. Find graphically, correct to one decimal, the real roots of :
(a) 2* - x 2 - 0; (b) x* - x - 1 - 0; (c) x 5 - x ~ 0.5 - 0; (d) e* + x - 0;
(e) $\tan x - x = 0$, $\pi < x < 3\pi/2$.
Isolate the roots (that is, for each root find an interval which contains just that root
and no others).
2. A sphere 2 ft in diameter is made of wood whose specific gravity is %. Find to
one-decimal accuracy the depth $h$ to which the sphere sinks in water. Hint: The volume
of a spherical segment is $\pi h^2(r - h/3)$. The volume of the submerged segment is equal
to the volume of displaced water, which must weigh as much as the sphere. If water
weighs 62 5 lb per ft 3 ,
62.5 -I*-.?- 62.5.
and since r ■» 1, we have h 3 — 3 h 2 + % » 0.
2. Simple Iterative Methods. When real roots of Eq. (1-1) have been
isolated, there are many methods for computing them to any degree of
accuracy. These all depend on the application of some iterative formula
which furnishes values of the succeeding approximations from the preced-
ing ones. The nature of restrictions imposed on the function F(x) in the
equation
F(x) - 0 (2-1)
in the two basic iterative methods discussed here is obvious from the
description of the methods. The simplest of these is the method of linear
interpolation , also known as the method of false position.
¹ A numerical determination of the roots of algebraic equations is frequently accomplished by some method of synthetic division (such as Horner's method) or by the root-squaring method (Graeffe's method). These special methods are discussed in many books. See, for example, F. B. Hildebrand, “Introduction to Numerical Analysis,” McGraw-Hill Book Company, Inc., New York, 1956. The methods of Secs. 2 and 3 of this chapter apply to all types of equations and are generally adequate for the determination of real roots.

Let the root $x_0$ of (2-1) be isolated between $x_1$ and $x_2$. Then, in the interval $(x_1, x_2)$, the graph of $y = F(x)$ may have the appearance shown in Fig. 2. If the points $P_1$ and $P_2$ in Fig. 2 are joined by a straight line, it will cut the $x$ axis at some point $x_3$, which usually is closer to the root $x_0$ than either $x_1$ or $x_2$. But from similar triangles,
$$\frac{x_3 - x_1}{x_2 - x_3} = \frac{-F(x_1)}{F(x_2)} \tag{2-2}$$
and on solving for $x_3$ we get
$$x_3 = \frac{x_1F(x_2) - x_2F(x_1)}{F(x_2) - F(x_1)}. \tag{2-3}$$
To obtain a closer approximation to $x_0$, we can determine the $x$ intercept of the straight line joining the point $P_3$ in Fig. 2 with the point $P_2$ and thus obtain the next approximation $x_4$. By repeating this process we obtain a sequence of values
$$x_1, x_2, x_3, \ldots, x_n, \ldots$$
which generally converges to $x_0$. The process described here is precisely that used in interpolating tabulated values of logarithms and other functions. In effect, it replaces a small portion of the curve by a straight line.
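A minimal sketch of the method of false position, built on formula (2-3), is given below. It uses the common variant that always keeps the root bracketed between the two current points, which is an assumption of this sketch rather than a detail stated in the text; the stopping rule, the tolerance, and the test cubic (taken from Sec. 1) are likewise illustrative choices.

```python
def false_position(F, x1, x2, tol=1e-9, max_iter=200):
    """Approximate a root of F isolated between x1 and x2 by formula (2-3)."""
    f1, f2 = F(x1), F(x2)
    if f1 * f2 > 0:
        raise ValueError("F(x1) and F(x2) must have opposite signs")
    x3 = x1
    for _ in range(max_iter):
        # x intercept of the chord through (x1, f1) and (x2, f2), Eq. (2-3).
        x3 = (x1 * f2 - x2 * f1) / (f2 - f1)
        f3 = F(x3)
        if abs(f3) < tol:
            break
        # Keep the pair of points whose function values differ in sign.
        if f1 * f3 < 0:
            x2, f2 = x3, f3
        else:
            x1, f1 = x3, f3
    return x3

def cubic(x):
    return x ** 3 - 146.25 * x - 682.5

root = false_position(cubic, 13.0, 14.0)
print(root)
```

Retaining a sign change at every step guarantees that the root never leaves the working interval, at the cost of one endpoint often remaining fixed.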
Another useful iterative method is based on rewriting (2-1) in the form
$$f(x) = g(x). \tag{2-4}$$
Now, if the real roots of
$$f(x) = c$$
can be determined for every real $c$, we can proceed as follows. Let $x_1$ be an approximate value of the root $x_0$ of (2-1). This, of course, is also an approximate root of (2-4), since (2-1) and (2-4) are equivalent equations. On setting $x = x_1$ in the right-hand member of (2-4) we get the equation
$$f(x) = g(x_1), \tag{2-5}$$
which by hypothesis we can solve. If the solution of (2-5) is $x_2$, we obtain, on setting $x = x_2$ in the right-hand member of (2-4),
$$f(x) = g(x_2). \tag{2-6}$$
The solution $x_3$ of (2-6) we call the third approximation, and in general, the $n$th approximation $x_n$ is determined by solving
$$f(x) = g(x_{n-1}). \tag{2-7}$$
From the geometric interpretations of this procedure, which we give next, it will be seen that the sequence $x_1, x_2, \ldots, x_n, \ldots$ converges to the root $x_0$ of (2-1) if, in the interval of length $2|x_1 - x_0|$ centered at $x_0$, we have
$$|f'(x)| > |g'(x)| \tag{2-8}$$
and the derivatives are bounded.
Suppose, first, that the slopes of the curves
$$y = f(x), \qquad y = g(x) \tag{2-9}$$
in the interval $(x_0, x_1)$ (Fig. 3) have the same sign and satisfy (2-8). When $x = x_1$ is taken as the first approximation to $x_0$, Eq. (2-5) yields the second approximation $x_2$, which corresponds to the abscissa of the point of intersection $P_2$ of the straight line $y = g(x_1)$ with $y = f(x)$. Equation (2-6) gives $x_3$, which is the abscissa of the point of intersection $P_3$ of the straight line $y = g(x_2)$ with $y = f(x)$, and so on. The sequence $x_1, x_2, \ldots$ obviously converges to $x_0$.
The situation when the slopes of the curves (2-9) satisfy (2-8) but are opposite in sign is illustrated in Fig. 4. The value $x_2$ determined by solving (2-5) is the abscissa of the point of intersection $P_2$ of $y = f(x)$ with $y = g(x_1)$. It lies on the opposite side of the root from $x_1$. The third approximation $x_3$ is the abscissa of the intersection of $y = g(x_2)$ with $y = f(x)$, and it lies on the same side as $x_1$ but nearer to $x_0$. In Fig. 3 the approach to the intersection $P_0$ is along a staircase path, while in Fig. 4 it is along a spiral. In either case, the rapidity of convergence¹ depends on the nature of the functions $f(x)$ and $g(x)$.

¹ Some criteria for the speed of convergence are given in Hildebrand, op. cit.
Example 1. Determine the approximate values of the real roots of
$$e^x - 4x = 0. \tag{2-10}$$
The real roots of this equation are the abscissas of the points of intersection of the curves $y = e^x$ and $y = 4x$ shown in Fig. 5. It appears that the smaller of the roots, $x_0$, lies in the vicinity of $x = 0.3$. The larger root, $\bar{x}_0$, is close to $x = 2.1$. Since for $x = x_0$ the slope of $y = 4x$ is greater than that of $y = e^x$, we write (2-10) in the form
$$x = \tfrac{1}{4}e^x,$$
so that in the notation of Eq. (2-4)
$$f(x) = x \quad \text{and} \quad g(x) = \tfrac{1}{4}e^x.$$
The sequence of approximations $x_n$ according to (2-7) is thus determined from
$$x_{n+1} = \tfrac{1}{4}e^{x_n}, \qquad n = 1, 2, \ldots. \tag{2-11}$$
If we take $x_1 = 0.3$, we get¹ from (2-11)

x_2 = ¼ e^{0.3} = ¼(1.34986) = 0.3374
x_3 = ¼ e^{x_2} = ¼(1.40130) = 0.3503
x_4 = ¼ e^{x_3} = ¼(1.41949) = 0.3549
x_5 = ¼ e^{x_4} = ¼(1.42603) = 0.3565
x_6 = ¼ e^{x_5} = ¼(1.42832) = 0.3571
x_7 = ¼ e^{x_6} = ¼(1.42917) = 0.3573.

¹ In performing these calculations it is convenient to use tables such as “Tables of Exponential Functions,” National Bureau of Standards, Washington, D.C., 1951.

If only three-decimal-place accuracy is required, the computations can be terminated at this stage.
To obtain the second root we note that at $x = \bar{x}_0$, the slope of $y = 4x$ is less than that of $y = e^x$. If we write (2-10) in the form
$$e^x = 4x \quad \text{or} \quad x = \log 4x,$$
so that $f(x) = x$ and $g(x) = \log 4x$, then the condition (2-8) is satisfied at $x = \bar{x}_0$.
The desired sequence $(x_n)$ is now given by
$$x_{n+1} = \log 4x_n, \qquad n = 1, 2, \ldots,$$
and we can take $x_1 = 2.1$.
Using tables of natural logarithms¹ we find

x_2 = log 4x_1 = log 8.4    = 2.12823
x_3 = log 4x_2 = log 8.5129 = 2.14158
x_4 = log 4x_3 = log 8.5663 = 2.14783
x_5 = log 4x_4 = log 8.5913 = 2.15075
x_6 = log 4x_5 = log 8.6030 = 2.15211
x_7 = log 4x_6 = log 8.6084 = 2.15273
x_8 = log 4x_7 = log 8.6109 = 2.15303
x_9 = log 4x_8 = log 8.6121 = 2.15316.

The value of the root $\bar{x}_0$, correct to three decimals, is 2.153. We do not give a discussion of the errors in the approximations obtained by such calculations because a rigorous analysis of errors in the iterative procedures is fairly involved.²
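The two iterations of Example 1 can be carried out mechanically. In the sketch below the helper name `iterate`, the stopping rule (successive values agreeing to within a tolerance), and the tolerance itself are assumptions made for illustration; the starting values and the two rearrangements are those of the example.

```python
import math

def iterate(g, x1, tol=5e-7, max_iter=1000):
    """Repeat x_{n+1} = g(x_n) until successive values differ by less than tol."""
    x = x1
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not settle within max_iter steps")

# Smaller root: x = (1/4) e^x, starting from x_1 = 0.3.
small_root = iterate(lambda x: 0.25 * math.exp(x), 0.3)

# Larger root: x = log 4x (natural logarithm), starting from x_1 = 2.1.
large_root = iterate(lambda x: math.log(4.0 * x), 2.1)

print(round(small_root, 3), round(large_root, 3))
```

Each rearrangement converges only to the root at which condition (2-8) holds for it, which is why two different forms of the same equation are needed.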
Example 2. Find an approximate value of the real root of
$$x - \tan x = 0 \tag{2-12}$$
near $x = 3\pi/2$.
From the graphs of
$$y = x \quad \text{and} \quad y = \tan x$$
in Fig. 6, it appears that Eq. (2-12) has just one real root in each of the intervals $(2n - 1)\pi/2 < x < (2n + 1)\pi/2$, where $n = 0, \pm 1, \pm 2, \ldots$.
It is convenient to rewrite (2-12) in the form
$$x = \tan^{-1} x,$$
so that in the notation (2-4) $f(x) = x$ and $g(x) = \tan^{-1} x$. This choice assures that the condition (2-8) is satisfied at the root $x_0$.
The sequence of approximations this time is given by
$$x_{n+1} = \tan^{-1} x_n, \qquad n = 1, 2, \ldots.$$
On taking $x_1 = 3\pi/2 = 4.7124$ radians, we find

x_2 = tan⁻¹ 4.7124 = 4.5033
x_3 = tan⁻¹ 4.5033 = 4.4938
x_4 = tan⁻¹ 4.4938 = 4.4935,

which suggest that the root $x_0$, correct to three decimals, is 4.493.

¹ For example, “Tables of Natural Logarithms,” National Bureau of Standards, Washington, D.C., 1941.
² A brief discussion is contained in Hildebrand, op. cit., chap. 10.
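In reproducing Example 2 on a machine, note that the tabulated values lie between π and 3π/2, so the inverse tangent must be taken on that branch. The sketch below assumes this reading and obtains the branch by adding π to the principal value returned by `math.atan`; the number of steps is an arbitrary choice.

```python
import math

def arctan_branch(x):
    """Inverse tangent with values between pi/2 and 3*pi/2 (assumed branch)."""
    return math.atan(x) + math.pi

x = 3.0 * math.pi / 2.0          # x_1 = 4.7124
approximations = [x]
for _ in range(20):
    x = arctan_branch(x)
    approximations.append(x)

# The first few entries can be compared with the values listed in the example.
print([round(v, 4) for v in approximations[:4]])
print(round(approximations[-1], 3))
```

Using the principal value alone would drive the iteration toward the root x = 0 instead of the root near 3π/2.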
These examples indicate that if it is possible to write Eq. (2-1) in the form
$$x = g(x),$$
and if $|g'(x)| \le M < 1$ in the interval of length $2|x_1 - x_0|$ centered at $x_0$, then the recursion formula giving the desired approximating sequence is
$$x_{n+1} = g(x_n), \qquad n = 1, 2, \ldots. \tag{2-13}$$
PROBLEMS
1. Use both methods of this section to obtain, correct to two decimals, the values of
the real roots in Probs. 1 and 2 of Sec, 1.
2. Find in the manner of the examples of this section the real roots of x B — z — 0.2 ® 0
correct to three decimals.
3. Newton’s Method* The successive terms in the approximating se-
quence in the method of false position (see Fig. 2) are determined by the
intersection of the secant line with the x axis. Newton proposed con-
structing an approximating sequence determined by the intersection with
the x axis of the tangent line to the curve y « F(x ).
Thus let the root x = x 0 of
F{x) - 0
(3-1)
SEC, 3} SOLUTION OF EQUATIONS 685
lie in the vicinity of $x = x_1$ (see Fig. 7). The equation of the tangent
line to $y = F(x)$ at $P_1(x_1, y_1)$ is
$$y - F(x_1) = F'(x_1)(x - x_1). \tag{3-2}$$
If the curve $y = F(x)$ has the appearance shown in Fig. 7, the tangent
line (3-2) cuts the x axis at $x_2$, which is a better approximation to the root
than $x_1$. To determine $x_2$ we set $y = 0$ and find
$$x_2 = x_1 - \frac{F(x_1)}{F'(x_1)}$$
if $F'(x_1) \ne 0$. Having determined $x_2$, we find in the same way that the
tangent to $y = F(x)$ at $P_2[x_2, F(x_2)]$ intersects the axis at
$$x_3 = x_2 - \frac{F(x_2)}{F'(x_2)},$$
and in general,
$$x_{n+1} = x_n - \frac{F(x_n)}{F'(x_n)}, \qquad n = 1, 2, \ldots. \tag{3-3}$$
The geometric considerations indicate that when $y = F(x)$ is a monotone
increasing or decreasing function in the interval $(x_1, x_2)$ [so that
$F'(x)$ does not change sign] and when there is no point of inflection in
this interval [so that $F''(x)$ does not change sign], the sequence (3-3)
converges to the root $x_0$.
The situations corresponding to the cases when there is a point of inflection
or a horizontal tangent to $y = F(x)$ in the vicinity of the root are
illustrated in Figs. 8 and 9. It is clear from these figures that in these
cases the sequence (3-3) need not converge to $x_0$. Thus, before applying
Newton's method one should examine the behavior of $F'(x)$ and $F''(x)$
in the vicinity of the root.
Example: Find the angle subtended at the center of a circle by an arc whose length is
double the length of its chord.
Let the arc BCA (Fig. 10) be of length 2BA. If the angle subtended by this arc at
the center of the circle is 2x radians, then the arc $BCA = 2xr$ while $BA = 2r \sin x$,
r being the radius of the circle.
Our problem requires that
$$2xr = 4r \sin x,$$
or
$$x - 2 \sin x = 0. \tag{3-4}$$
On graphing the functions $y = x$ and $y = 2 \sin x$ (Fig. 11), we see that they intersect at
$x = 0$ and at $x = 1.88$ radians, approximately. We reject the trivial solution $x = 0$.
Since $y = x - 2 \sin x$ is obviously monotone increasing and has no point of inflection
near the root $x_0$, we can apply formula (3-3) with $x_1 = 1.88$. We find
$$x_2 = x_1 - \frac{x_1 - 2 \sin x_1}{1 - 2 \cos x_1} = 1.88 - \frac{1.88 - 2 \sin 1.88}{1 - 2 \cos 1.88} = 1.896.$$
The third approximation is
$$x_3 = x_2 - \frac{x_2 - 2 \sin x_2}{1 - 2 \cos x_2} = 1.896 - \frac{1.896 - 2 \sin 1.896}{1 - 2 \cos 1.896} = 1.8955,$$
which is nearly the same as $x_2$. The angle subtended by the arc BCA, as given by this
approximation, is 3.7910 radians.
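A minimal Python sketch of formula (3-3), applied to Eq. (3-4), is given below. The helper name `newton`, the tolerance, and the iteration cap are assumptions made for illustration; the guard against a vanishing derivative reflects the horizontal-tangent difficulty discussed above.

```python
import math

def newton(F, dF, x1, tol=1e-10, max_iter=50):
    """Apply x_{n+1} = x_n - F(x_n)/F'(x_n), starting from x1."""
    x = x1
    for _ in range(max_iter):
        slope = dF(x)
        if slope == 0:
            raise ZeroDivisionError("horizontal tangent: F'(x) = 0")
        x_next = x - F(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("Newton iteration did not converge")

def F(x):
    return x - 2 * math.sin(x)      # left-hand member of (3-4)

def dF(x):
    return 1 - 2 * math.cos(x)      # its derivative

x0 = newton(F, dF, 1.88)
print(round(x0, 4), round(2 * x0, 4))   # half-angle x and full angle 2x, in radians
```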
PROBLEMS
1. Calculate by Newton's method the roots in Examples 1 and 2 of Sec. 2.
2. Solve by Newton’s method Prob. 2, Sec. 1.
3. Find to three decimals by Newton's method the angle subtended at the center of
a circle by a chord which cuts off a segment whose area is one-fourth that of the circle.
4 . Find by Newton’s method to three decimal places the real roots of the following
equations: (a) x - coex - 0; (6) x -f e* * 0; (c) x 4 ~ x - 1 « 0; (d) x 9 - 25 - 0;
M as 1 — x - 0.2 «• 0.
4. Systems of Linear Equations. The Gauss Reduction. No doubt the
reader is familiar with Cramer’s rule for solving systems of n linear equations
in n unknowns by determinants. 1 Although Cramer’s rule is important
in numerous theoretical considerations, it is of questionable practical value
when the given system contains more than two unknowns. Usually it is
easier to obtain solutions by some process of elimination of unknowns.
The simplest practical method for solving systems of linear equations,
based on the idea of elimination, is the Gauss reduction method. Its
several variants form the basis for most techniques used in the solutions
of large systems of equations. 2
The idea of the method is simple. Let it be required to solve a system
of n linear equations
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2\\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots &\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n
\end{aligned} \tag{4-1}$$
in n unknowns $x_1, x_2, \ldots, x_n$. We divide the first equation in (4-1) by $a_{11}$, solve for
$x_1$, and use the result to eliminate $x_1$ in the other equations. The resulting
system of $n - 1$ equations in $x_2, \ldots, x_n$ is treated in the same way.
That is, we divide the first of these equations by the coefficient of $x_2$ and
use the result to eliminate $x_2$ from the remaining equations. After continuing
the process n times$^3$ we obtain an equivalent system
$$\begin{aligned}
x_1 + a'_{12}x_2 + a'_{13}x_3 + \cdots + a'_{1n}x_n &= c'_1\\
x_2 + a'_{23}x_3 + \cdots + a'_{2n}x_n &= c'_2\\
\cdots\cdots\cdots\cdots\cdots\cdots &\\
x_{n-1} + a'_{n-1,n}x_n &= c'_{n-1}\\
x_n &= c'_n
\end{aligned} \tag{4-2}$$
provided the given system has a unique solution; the primes mark coefficients
altered by the elimination. The substitution $x_n = c'_n$
in the preceding equation in the set (4-2) yields the value of $x_{n-1}$, and
by working backward we obtain in succession the values of $x_{n-2}, x_{n-3}, \ldots, x_1$.
In practice the Gauss reduction can be performed in the manner indi-
cated in the following example.
1 A summary of the properties of determinants and Cramer's rule is given in Appendix A.
2 Among such variants are the Crout and the Gauss-Jordan reductions. These are
described in Hildebrand, op. cit., and in many other books.
3 If the coefficient of $x_r$ in the rth equation vanishes, it is necessary to renumber the
variables or equations.
Example: Solve the system
$$\begin{aligned}
2.843x_1 - 1.326x_2 + 9.841x_3 &= 5.643\\
8.673x_1 + 1.295x_2 - 3.215x_3 &= 3.124\\
0.173x_1 - 7.724x_2 + 2.832x_3 &= 1.694
\end{aligned} \tag{4-3}$$
by the method of Gauss's reduction.
On dividing each equation in (4-3) by the coefficient of $x_1$ in that equation, we get
$$\begin{aligned}
x_1 - 0.46641x_2 + 3.4615x_3 &= 1.9849\\
x_1 + 0.14931x_2 - 0.37069x_3 &= 0.36020\\
x_1 - 44.647x_2 + 16.370x_3 &= 9.7919.
\end{aligned} \tag{4-4}$$
The subtraction of the second equation in (4-4) from the first and the third gives
$$\begin{aligned}
-0.61572x_2 + 3.8322x_3 &= 1.6247\\
-44.796x_2 + 16.741x_3 &= 9.4317
\end{aligned}$$
and, on dividing these by the coefficients of $x_2$, we find
$$\begin{aligned}
x_2 - 6.2239x_3 &= -2.6387\\
x_2 - 0.37372x_3 &= -0.21055.
\end{aligned} \tag{4-5}$$
Subtracting the second equation from the first in (4-5) yields
$$-5.8502x_3 = -2.4282, \tag{4-6}$$
so that $x_3 = 0.41506$.
The reduced system consists of the first equations in (4-4) and (4-5) and Eq. (4-6).
It is
$$\begin{aligned}
x_1 - 0.46641x_2 + 3.4615x_3 &= 1.9849\\
x_2 - 6.2239x_3 &= -2.6387\\
x_3 &= 0.41506.
\end{aligned} \tag{4-7}$$
The substitution of the value of $x_3$ from the last into the second equation of (4-7) gives
$$x_2 = -2.6387 + 6.2239(0.41506) = -0.055408$$
and the first reduced equation finally yields
$$x_1 = 1.9849 + 0.46641(-0.055408) - 3.4615(0.41506) = 0.52232.$$
There are numerous modifications of the procedure just indicated, some of which are
adapted for computations on desk calculators while others are more suitable for high-
speed electronic computers.
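For readers working with a programmable machine, the reduction and back substitution might be sketched in Python as below. This is an illustrative variant, not the hand layout of the example: the function name `gauss_solve` is an assumption, and the row interchange that picks the largest available pivot (partial pivoting) is a common safeguard added here; it generalizes the renumbering of equations mentioned in the footnote.

```python
def gauss_solve(a, c):
    """Solve the n-by-n system a x = c by Gauss reduction and back substitution."""
    n = len(c)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, c)]
    for k in range(n):
        # Bring the row with the largest coefficient of x_k into position k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0:
            raise ValueError("system has no unique solution")
        m[k], m[p] = m[p], m[k]
        pivot = m[k][k]
        m[k] = [v / pivot for v in m[k]]          # make the leading coefficient 1
        for i in range(k + 1, n):                 # eliminate x_k from later rows
            factor = m[i][k]
            m[i] = [vi - factor * vk for vi, vk in zip(m[i], m[k])]
    x = [0.0] * n
    for i in reversed(range(n)):                  # back substitution
        x[i] = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
    return x

a = [[2.843, -1.326, 9.841],
     [8.673, 1.295, -3.215],
     [0.173, -7.724, 2.832]]
c = [5.643, 3.124, 1.694]
x = gauss_solve(a, c)
print([round(v, 4) for v in x])
```

Because the machine carries more digits than the five-figure hand computation, the last decimal may differ slightly from the values found above.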
PROBLEM
Use Cramer’s rule and also apply the Gauss reduction to solve the following systems;
(a) 2* + y + 3« « 2,
3* — 2 y — z » 1,
* — y + 2 «* — 1 ;
„ (&) 2*i + *2 + 3*g + *4 » —2,
5*1 + 3*a — *8 — *4 1,
*1 — 2*2 + 4*a + 3*4 ** 4,
3*1 ^*2 +*j»2;
(c) LS29»i + L415xi - 2.291X8 - 0.532,
L395xi - 0.531x8 - 1.211,
l.OOlx! + 2.093X8 - 0,556.
5. An Iterative Method for Systems of Linear Equations. Except for
the round-off errors the Gauss reduction method explained in the pre-
ceding section is exact. When the determinant of the system (4-1) is
different from zero, it yields the desired solution after a finite number of
steps. However, successive steps leading to an equivalent triangular
system (4-2) may prove laborious and ill-adapted to machine calculations.
For this reason, a variety of iterative methods, which in theory require
an infinite number of steps to obtain an exact solution, have been devised.
One of these methods, due 1 to L. Seidel, is based on the use of the
iterative formula (2-13). The convergence of any iterative method ob-
viously depends on the character of the system under consideration.
In many cases the system (4-1) can be rewritten so that in the ith
equation the coefficient $a_{ii}$ of the unknown $x_i$ is numerically large compared
with other coefficients. That is to say, the coefficients along the diagonal
of the system (4-1) dominate the other coefficients. In this event by solving
the ith equation for $x_i$ we can rewrite such a system (4-1) in the form
$$\begin{aligned}
x_1 &= \frac{1}{a_{11}}(c_1 - a_{12}x_2 - a_{13}x_3 - \cdots - a_{1n}x_n),\\
x_2 &= \frac{1}{a_{22}}(c_2 - a_{21}x_1 - a_{23}x_3 - \cdots - a_{2n}x_n),\\
\cdots &\\
x_n &= \frac{1}{a_{nn}}(c_n - a_{n1}x_1 - a_{n2}x_2 - \cdots - a_{n,n-1}x_{n-1}).
\end{aligned} \tag{5-1}$$
If we set $x_1 = x_2 = \cdots = x_n = 0$ in the right-hand members of (5-1), we
obtain
$$x_i^{(1)} = \frac{c_i}{a_{ii}}, \qquad i = 1, 2, \ldots, n, \tag{5-2}$$
which is called the first approximation to the solution of (5-1).
The substitution of this first approximation in the right-hand members
of (5-1) yields the second approximation $x_i^{(2)}$, and so on. The cycle is
then repeated with the expectation that the values $x_i^{(k)}$ after the kth iteration
are not substantially altered by further iterations.$^2$
1 Generally called the Gauss-Seidel method.
2 There are several criteria for convergence of this process which generally are not
easy to verify. It is known that when the coefficients in (4-1) are symmetric (so that
$a_{ij} = a_{ji}$) and the matrix $(a_{ij})$ is positive definite, the Seidel process always converges.
See Hildebrand, op. cit., for a brief discussion of several criteria.
In practice the iteration process described above is usually modified by
taking as the first approximation $x_1^{(1)}$ the value of $x_1$ obtained from the
first equation in (5-1) by setting $x_2 = x_3 = \cdots = x_n = 0$. Using this
value in the second equation in place of $x_1$ and setting $x_3 = x_4 = \cdots = x_n = 0$,
one obtains the approximation $x_2^{(1)}$. To obtain $x_3^{(1)}$ one inserts for
$x_1$ and $x_2$ the values $x_1^{(1)}$ and $x_2^{(1)}$ in the third equation and sets
$x_4 = x_5 = \cdots = x_n = 0$. Finally, to get the value of $x_n^{(1)}$ one uses previously found
values $x_1^{(1)}, \ldots, x_{n-1}^{(1)}$ in the last equation of the system (5-1). This process
is repeated to obtain approximations of higher orders.
This particular choice of approximations usually improves the rapidity
of convergence of the process. We illustrate it by an example.
Example: The system (4-3) can be rewritten in the form
$$\begin{aligned}
8.673x_1 + 1.295x_2 - 3.215x_3 &= 3.124\\
0.173x_1 - 7.724x_2 + 2.832x_3 &= 1.694\\
2.843x_1 - 1.326x_2 + 9.841x_3 &= 5.643
\end{aligned} \tag{5-3}$$
in which the diagonal coefficients dominate.
We next write (5-3) in the form (5-1) and get
$$\begin{aligned}
x_1 &= \frac{1}{8.673}(3.124 - 1.295x_2 + 3.215x_3)\\
x_2 &= -\frac{1}{7.724}(1.694 - 0.173x_1 - 2.832x_3)\\
x_3 &= \frac{1}{9.841}(5.643 - 2.843x_1 + 1.326x_2).
\end{aligned} \tag{5-4}$$
To obtain $x_1^{(1)}$ we set $x_2 = x_3 = 0$ in the first equation in (5-4) and find
$$x_1^{(1)} = \frac{3.124}{8.673} = 0.36020.$$
Inserting this value for $x_1$ and setting $x_3 = 0$ in the second equation in (5-4), we get
$$x_2^{(1)} = -0.21125.$$
Finally, $x_3^{(1)} = 0.44089$ is obtained by using the values $x_1^{(1)}$ and $x_2^{(1)}$ in place of $x_1$
and $x_2$ in the third of Eqs. (5-4).
A repetition of the process yields second, third, and higher approximations. These
are recorded in the table:

k    $x_1^{(k)}$    $x_2^{(k)}$    $x_3^{(k)}$
1    0.36020    -0.21125    0.44089
2    0.55517    -0.04523    0.40694
3    0.51780    -0.05852    0.41594
4    0.52312    -0.05501    0.41488
5    0.52220    -0.05550    0.41508
6    0.52236    -0.05543    0.41505
7    0.52233    -0.05544    0.41505
SBC- 6 ] INTERPOLATION AND EMPIRICAL FORMULAS ' 691
A comparison with the values found in Sec. 4 by the Gauss reduction method shows
that in this problem six iterations were necessary to get four-decimal accuracy.
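The modified iteration, in which each newly computed value is used at once, might be sketched in Python as follows. The function name `gauss_seidel` and the fixed number of sweeps are assumptions made for illustration; starting from zeros reproduces the first approximation described above, since unknowns not yet computed are taken as 0.

```python
def gauss_seidel(a, c, sweeps):
    """Return the list of successive approximations for the system a x = c."""
    n = len(c)
    x = [0.0] * n                 # unknowns not yet computed are taken as zero
    history = []
    for _ in range(sweeps):
        for i in range(n):
            # Solve the ith equation for x_i, using the latest available values.
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (c[i] - s) / a[i][i]
        history.append(list(x))
    return history

# System (5-3), arranged so that the diagonal coefficients dominate.
a = [[8.673, 1.295, -3.215],
     [0.173, -7.724, 2.832],
     [2.843, -1.326, 9.841]]
c = [3.124, 1.694, 5.643]
history = gauss_seidel(a, c, 7)
for k, xk in enumerate(history, start=1):
    print(k, [round(v, 5) for v in xk])
```

The printed rows can be compared with the hand-computed table; small discrepancies in the last digit are to be expected because the machine does not round intermediate results.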
INTERPOLATION. EMPIRICAL FORMULAS. LEAST SQUARES
6. Differences. One of the problems connected with the analysis of
experimental data concerns the representation of such data by analytic
formulas. Thus, we may wish to represent, either exactly or approximately,
a set of observed values $(x_i, y_i)$ by some relationship of the form
$y = f(x)$. In such analysis the concept of differences is important.
We consider a set of pairs of values $(x_i, y_i)$, where $i = 0, 1, \ldots, n$, which
can be represented by points in the xy plane. The differences between
successive pairs of ordinates $y_{i+1}$ and $y_i$ we call the first forward differences
of the $y_i$, and we denote them by $\Delta y_i$. Thus,
$$\Delta y_i = y_{i+1} - y_i, \qquad i = 0, 1, 2, \ldots, n - 1. \tag{6-1}$$
The second forward differences are defined by
$$\Delta^2 y_i = \Delta y_{i+1} - \Delta y_i$$
and, in general, the kth forward differences are
$$\Delta^k y_i = \Delta^{k-1} y_{i+1} - \Delta^{k-1} y_i. \tag{6-2}$$
These differences are usually represented in a tabular form:
Table 1
in which the quantities in each column represent the differences between
the quantities in the preceding column. These are usually placed midway
between the quantities being subtracted, so that the forward differences
with like subscripts lie along the diagonals indicated in the table by arrows.
We note that if the rth differences $\Delta^r y_i$ are constant, then all differences
of order higher than r are zero.$^1$
Now, it follows from (6-1) and (6-2) that
$$\begin{aligned}
y_1 &= y_0 + \Delta y_0\\
y_2 &= y_1 + \Delta y_1 = (y_0 + \Delta y_0) + (\Delta^2 y_0 + \Delta y_0) = y_0 + 2\Delta y_0 + \Delta^2 y_0\\
y_3 &= y_2 + \Delta y_2 = (y_0 + 2\Delta y_0 + \Delta^2 y_0) + (\Delta^2 y_1 + \Delta y_1)\\
&= (y_0 + 2\Delta y_0 + \Delta^2 y_0) + (\Delta^3 y_0 + \Delta^2 y_0 + \Delta^2 y_0 + \Delta y_0)\\
&= y_0 + 3\Delta y_0 + 3\Delta^2 y_0 + \Delta^3 y_0.
\end{aligned}$$
These results can be written symbolically as
$$y_1 = (1 + \Delta)y_0, \qquad y_2 = (1 + \Delta)^2 y_0, \qquad y_3 = (1 + \Delta)^3 y_0$$
in which $(1 + \Delta)^k$ is an operator on $y_0$ with the exponent on the $\Delta$ indicating
the order of the difference. The difference operator $\Delta$ is analogous to
the differential operator D introduced in Chap. 1.
We easily establish by induction that
$$y_k = (1 + \Delta)^k y_0, \qquad k = 1, 2, \ldots, \tag{6-3}$$
or, in the expanded form,
$$y_k = y_0 + k\,\Delta y_0 + \frac{k(k - 1)}{2!}\Delta^2 y_0 + \frac{k(k - 1)(k - 2)}{3!}\Delta^3 y_0 + \cdots. \tag{6-4}$$
Formula (6-4) enables us to represent every value $y_k$ in terms of $y_0$ and
the forward differences $\Delta y_0, \Delta^2 y_0, \ldots$.
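The construction of a forward-difference table and the use of (6-4) can be checked with a short Python sketch. The data below (cubes of the integers 0 to 5) are a hypothetical illustration chosen so that the third differences are constant; the function names are likewise assumptions introduced here.

```python
from math import comb

def forward_differences(y):
    """Return the columns [y, Δy, Δ²y, ...] of a forward-difference table."""
    columns = [list(y)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([b - a for a, b in zip(prev, prev[1:])])
    return columns

def y_from_differences(columns, k):
    """Evaluate (6-4): y_k as a sum of binomial coefficients times Δ^r y_0."""
    return sum(comb(k, r) * columns[r][0] for r in range(k + 1))

y = [x ** 3 for x in range(6)]        # hypothetical data: 0, 1, 8, 27, 64, 125
table = forward_differences(y)
for order, column in enumerate(table):
    print(order, column)
print([y_from_differences(table, k) for k in range(6)])
```

For these data the third differences are all equal and the higher ones vanish, in agreement with the remark on constant rth differences.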
We can derive a similar formula by starting with the values of the $y_i$
at the end of Table 1 and forming the backward differences defined as
follows: The first backward differences $\nabla y_i$ are
$$\nabla y_i = y_i - y_{i-1}. \tag{6-5}$$
The second backward differences $\nabla^2 y_i$ are defined by
$$\nabla^2 y_i = \nabla y_i - \nabla y_{i-1}, \tag{6-6}$$
and in general, the kth backward differences $\nabla^k y_i$ are
$$\nabla^k y_i = \nabla^{k-1} y_i - \nabla^{k-1} y_{i-1}. \tag{6-7}$$
1 A differences table in a specific numerical example appears in the Example of the
next section.
A table of backward differences is indicated in Table 2, where the differences
$\nabla^k y_i$ with a fixed subscript i lie along the diagonals slanting up,
as shown by arrows.
Table 2
Now, from (6-5) to (6-7) we deduce that
$$\begin{aligned}
\nabla^2 y_n &= \nabla y_n - \nabla y_{n-1} = y_n - 2y_{n-1} + y_{n-2}\\
\nabla^3 y_n &= \nabla^2 y_n - \nabla^2 y_{n-1} = y_n - 3y_{n-1} + 3y_{n-2} - y_{n-3}
\end{aligned}$$
and in general
$$\nabla^k y_n = \nabla^{k-1} y_n - \nabla^{k-1} y_{n-1} = \sum_{r=0}^{k} (-1)^r \binom{k}{r} y_{n-r}, \tag{6-8}$$
where
$$\binom{k}{r} = \frac{k(k - 1)(k - 2) \cdots (k - r + 1)}{r!} \tag{6-9}$$
is the binomial coefficient of $x^r$ in the expansion of $(1 + x)^k$.
By using (6-8) successively in the definitions of backward differences
we find
$$\begin{aligned}
y_{n-1} &= y_n - \nabla y_n = (1 - \nabla)y_n,\\
y_{n-2} &= y_n - 2\nabla y_n + \nabla^2 y_n = (1 - \nabla)^2 y_n,
\end{aligned}$$
and, in general,
$$y_{n-k} = (1 - \nabla)^k y_n, \tag{6-10}$$
where $\nabla$ is the backward-difference operator. The formula (6-10) when
expanded reads
$$y_{n-k} = y_n - k\,\nabla y_n + \frac{k(k - 1)}{2!}\nabla^2 y_n - \frac{k(k - 1)(k - 2)}{3!}\nabla^3 y_n + \cdots. \tag{6-11}$$
It shows that any value of y in Table 2 can be expressed in terms of $y_n$
and backward differences $\nabla^k y_n$.
We shall use formulas (6-4) and (6-11) to derive certain interpolation
formulas and to deduce some formulas for numerical integration.
PROBLEMS
1. Compute the forward and backward differences for the following set of data:

x    1      2      3      4      5      6      7      8
y    2.106  2.808  3.614  4.604  5.857  7.451  9.467  11.985

2. Write expressions for the $y_k$, $k = 1, 2, \ldots$, in Prob. 1 by using (6-4) and (6-11).
7. Polynomial Representation of Data. Unless a statement to the contrary
is made, we shall suppose henceforth that the values $x_i$ in a given
set of data $(x_i, y_i)$, where $i = 0, 1, 2, \ldots, n$, are equally spaced. If the
spacing interval is h, then
$$x_1 = x_0 + h, \quad x_2 = x_0 + 2h, \quad \ldots, \quad x_n = x_0 + nh.$$
We pose the problem of representing the data by some formula $y = f(x)$,
which for $x = x_0 + kh$ yields $y_k = f(x_0 + kh)$. We shall frequently write
$f_k$ for $y_k$.
We observed in the preceding section that whenever the rth differences
of the $y_i$ are constant, then all differences of order higher than r vanish.
In this event formula (6-4) yields
$$y_k = y_0 + \binom{k}{1}\Delta y_0 + \binom{k}{2}\Delta^2 y_0 + \cdots + \binom{k}{r}\Delta^r y_0, \tag{7-1}$$
where the binomial coefficients $\binom{k}{r}$ are defined by
$$\binom{k}{r} = \frac{k(k - 1)(k - 2) \cdots (k - r + 1)}{r!}. \tag{7-2}$$
Since the $x_i$ are spaced h units apart,
$$x_k = x_0 + kh, \qquad k = 1, 2, \ldots, n,$$
so that
$$k = \frac{x_k - x_0}{h}. \tag{7-3}$$
Now the expression (7-2) is a polynomial of degree r in k. Therefore, on
substituting in (7-1) for k from (7-3) we obtain a polynomial of degree r
in $x_k$. When like powers of $x_k$ are collected, (7-1) takes the form
$$y_k = a_0 + a_1 x_k + a_2 x_k^2 + \cdots + a_r x_k^r. \tag{7-4}$$
Accordingly, the polynomial in x,
$$y(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_r x^r, \tag{7-5}$$
assumes the values $y_k$ when we set $x = x_k$. Thus, when the rth differences
of the $y_k$ are constant and the $x_k$ are equally spaced, the polynomial (7-5)
represents these data exactly.
It is easy to prove a converse to the effect that the rth differences of
the polynomial (7-5) are constant. It would suffice to show that the
first difference Ay(x) = y(x + h) — y(x) formed with the aid of (7-5)
is a polynomial of degree r — 1, for if differencing a polynomial once re-
duces its degree by 1, r successive differencings would yield a polynomial
of degree 0, that is, a constant. 1
When rth differences in a given set of data are not constant but differ
from one another by negligible amounts, the polynomial (7-5) represents
the data approximately.
Example: The set of data and the forward differences tabulated below suggest that
these data can be represented by a cubic polynomial $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3$ if
two-decimal accuracy is sufficient.
1 We leave it to the reader to show that $\Delta y(x)$ is, indeed, a polynomial of degree $r - 1$.
The result is analogous to the theorem that the derivative of a polynomial of degree r
is a polynomial of degree $r - 1$. The expression $\Delta y = y(x + h) - y(x)$, save for the
factor $1/h$, is the difference quotient used in defining the derivative.
The coefficients $a_i$ in this polynomial can be determined with the aid of formula (7-4)
by using (7-1) with $r = 3$ and by taking
$$y_0 = 2.105, \quad \Delta y_0 = 0.703, \quad \Delta^2 y_0 = 0.103, \quad \Delta^3 y_0 = 0.081.$$
Since such calculations present no interest, we do not give them here. It is more sensible
to determine the $a_i$ by the method of least squares of Sec. 11.
PROBLEMS
1. Given the table:

x    19     20     21      22      23      24      25
y    81.00  90.25  100.00  110.25  121.00  132.25  144.00

Compute second forward differences, and represent the data by $y = a_0 + a_1 x + a_2 x^2$.
Determine $a_0$, $a_1$, $a_2$ so that the polynomial passes through (a) the first three points,
(b) the last three points.
2. Discuss the calculation of the $y_k$ in Prob. 1 from (6-4) and (6-11).
8. Newton's Interpolation Formulas. When the data $(x_i, y_i)$, where
$i = 0, 1, 2, \ldots, n$, are presented in tabular form, an infinite number of
analytic relations $y = f(x)$ can be devised such that $y_i = f(x_i)$ either
exactly or approximately. Once a suitable form of $f(x)$ is determined, the
formula $y = f(x)$ can be used to calculate the ordinates y for x's not appearing
in the table. That is, the formula can be used for interpolation
or extrapolation.
The simplest of such formulas is a linear relationship based on the
assumption that the values of y in the interval $(x_i, x_{i+1})$ can be represented
by
$$y = y_i + \frac{y_{i+1} - y_i}{x_{i+1} - x_i}(x - x_i). \tag{8-1}$$
Formula (8-1) is precisely that used in estimating the values of such tabulated
functions as logarithms by the process of "interpolation by proportional
parts."
More accurate interpolation formulas are based on the assumption that
the desired value of y can be computed from a polynomial
$$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m \tag{8-2}$$
in which $m + 1$ coefficients $a_i$ are so chosen that $m + 1$ pairs of tabulated
values $(x_i, y_i)$ satisfy (8-2) exactly.$^1$
1 These $m + 1$ pairs may include the entire set of given values $(x_i, y_i)$, or they may
be a subset so chosen that $|x - x_i|$ is as small as possible.
In the preceding section we saw that when the data are represented by a
polynomial of degree m, then all forward differences of order higher than
m vanish. Accordingly, formula (6-4) yields
$$y_k = y_0 + k\,\Delta y_0 + \frac{k(k - 1)}{2!}\Delta^2 y_0 + \cdots + \frac{k(k - 1) \cdots (k - m + 1)}{m!}\Delta^m y_0 \tag{8-3}$$
and, since the $x_i$ are equally spaced, $x_k = x_0 + kh$, so that
$$k = \frac{x_k - x_0}{h}.$$
On inserting this value of k in (8-3) we get
$$y_k = y_0 + \frac{x_k - x_0}{h}\Delta y_0 + \frac{(x_k - x_0)(x_k - x_0 - h)}{2!\,h^2}\Delta^2 y_0 + \cdots + \frac{(x_k - x_0)(x_k - x_0 - h) \cdots (x_k - x_0 - mh + h)}{m!\,h^m}\Delta^m y_0. \tag{8-4}$$
This relation is satisfied by $m + 1$ pairs of the tabulated values. If
we assume that the value of y corresponding to an arbitrary x can be
obtained from (8-4) by replacing $x_k$ by x, we get the formula
$$y(x) = y_0 + \frac{x - x_0}{h}\Delta y_0 + \frac{(x - x_0)(x - x_0 - h)}{2!\,h^2}\Delta^2 y_0 + \cdots + \frac{(x - x_0)(x - x_0 - h) \cdots (x - x_0 - mh + h)}{m!\,h^m}\Delta^m y_0 \tag{8-5}$$
known as Newton's forward-difference interpolation formula. This formula
can, of course, be used for either interpolation or extrapolation.
By replacing $(x - x_0)/h$ by a dimensionless variable X which represents
the distance of x from $x_0$ in units of h, we get from (8-5)
$$y_X = y_0 + X\,\Delta y_0 + \frac{X(X - 1)}{2!}\Delta^2 y_0 + \cdots + \frac{X(X - 1) \cdots (X - m + 1)}{m!}\Delta^m y_0, \tag{8-6}$$
where $X = (x - x_0)/h$ and $y_X = y(x_0 + hX) = y(x)$.
A similar calculation based on the use of (6-11) yields Newton's backward-difference
interpolation formula
$$y_{n+X} = y_n + X\,\nabla y_n + \frac{X(X + 1)}{2!}\nabla^2 y_n + \cdots + \frac{X(X + 1) \cdots (X + m - 1)}{m!}\nabla^m y_n, \tag{8-7}$$
where
$$X = \frac{x - x_n}{h},$$
so that $x = x_n + hX$, and $y_{n+X} = y(x_n + hX) = y(x)$.
When the data cannot be represented by a polynomial, the right-hand
members of (8-6) and (8-7) are infinite series involving differences of all
orders.
Formulas (8-6) and (8-7) can be used to compute derivatives of tabulated
functions. Thus, on differentiating successively (8-5) with respect
to x and setting $x = x_0$ in the result, we get
$$\begin{aligned}
y'(x_0) &= \frac{1}{h}\left(\Delta y_0 - \frac{1}{2}\Delta^2 y_0 + \frac{1}{3}\Delta^3 y_0 - \frac{1}{4}\Delta^4 y_0 + \cdots\right)\\
y''(x_0) &= \frac{1}{h^2}\left(\Delta^2 y_0 - \Delta^3 y_0 + \frac{11}{12}\Delta^4 y_0 - \frac{5}{6}\Delta^5 y_0 + \cdots\right)\\
y'''(x_0) &= \frac{1}{h^3}\left(\Delta^3 y_0 - \frac{3}{2}\Delta^4 y_0 + \frac{7}{4}\Delta^5 y_0 - \cdots\right)\\
y^{(4)}(x_0) &= \frac{1}{h^4}\left(\Delta^4 y_0 - 2\Delta^5 y_0 + \cdots\right).
\end{aligned} \tag{8-8}$$
Formulas (8-8) should be used with caution because even when $y = f(x)$
is well represented by the polynomial $P(x)$, the derivatives of $f(x)$ may
differ significantly from those of $P(x)$.
Example: Using the data given in the Example in Sec. 7, determine an approximate
value for the y corresponding to $x = 2.2$.
First, let y be determined by using only the two neighboring observed values (hence,
$m = 1$). Then, $x_0 = 2$, $y_0 = 2.808$, $\Delta y_0 = 0.806$, and $X = (2.2 - 2)/1 = 0.2$. Hence,
$$y = 2.808 + 0.2(0.806) = 2.969,$$
which has been reduced to three decimal places because the observed data are not given
more accurately. This is simply a straight-line interpolation by proportional parts.
If the three nearest values are chosen, $m = 2$, $x_0 = 1$, $y_0 = 2.105$, $\Delta y_0 = 0.703$,
$\Delta^2 y_0 = 0.103$, and $X = 2.2 - 1 = 1.2$. Then,
$$y = 2.105 + 1.2(0.703) + \frac{(1.2)(0.2)}{2}(0.103) = 2.961,$$
correct to three decimal places.
If the four nearest values are chosen, $m = 3$, $x_0 = 1$, $y_0 = 2.105$, $\Delta y_0 = 0.703$,
$\Delta^2 y_0 = 0.103$, $\Delta^3 y_0 = 0.081$, and $X = 1.2$. Therefore,
$$y = 2.105 + 1.2(0.703) + \frac{(1.2)(0.2)}{2}(0.103) + \frac{(1.2)(0.2)(-0.8)}{6}(0.081) = 2.958,$$
correct to three decimal places.
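The three computations of this example can be reproduced with a short Python sketch of formula (8-6). The function name `newton_forward` and its argument layout are assumptions made here; the ordinates 2.105, 2.808, 3.614, 4.604 are the tabulated values implied by the differences quoted in the example.

```python
def newton_forward(x0, h, ys, x):
    """Evaluate (8-6) with m = len(ys) - 1; ys are the ordinates at x0, x0 + h, ..."""
    # Collect the leading differences y0, Δy0, Δ²y0, ...
    diffs, column = [], list(ys)
    while column:
        diffs.append(column[0])
        column = [b - a for a, b in zip(column, column[1:])]
    X = (x - x0) / h                      # distance from x0 in units of h
    total, coeff = 0.0, 1.0
    for r, d in enumerate(diffs):
        total += coeff * d                # coeff = X(X-1)...(X-r+1)/r!
        coeff *= (X - r) / (r + 1)
    return total

y1 = newton_forward(2, 1, [2.808, 3.614], 2.2)                 # m = 1
y2 = newton_forward(1, 1, [2.105, 2.808, 3.614], 2.2)          # m = 2
y3 = newton_forward(1, 1, [2.105, 2.808, 3.614, 4.604], 2.2)   # m = 3
print(round(y1, 3), round(y2, 3), round(y3, 3))
```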
PROBLEMS
1. Compute with the aid of formulas (8-6) and (8-7) the approximate values of y
corresponding to $x = 6.6$ from the data of the Example in Sec. 7. Use two and three
neighboring values.
2. Extrapolate the value of y for $x = 8.2$ from the data in the Example of Sec. 7 with
the aid of (a) formula (8-6), (b) formula (8-7). Use $m = 2$.
3. Compute $y'(1)$ and $y''(1)$ from the data of the Example of Sec. 7 with the aid of
(8-8).
9. Lagrange's Interpolation Formula. The interpolation formulas developed
in the preceding section apply only when the given set of $x_i$ is
an arithmetic progression. If this is not the case, some other type of
formula must be applied.
As in Sec. 8, select the $m + 1$ pairs of observed values for which $|x - x_i|$
is as small as possible, and denote them by $(x_i, y_i)$ where $i = 0, 1, 2, \ldots, m$.
Let the mth-degree polynomials $P_k(x)$, where $k = 0, 1, 2, \ldots, m$, be defined
by
$$P_k(x) = \frac{(x - x_0)(x - x_1) \cdots (x - x_m)}{x - x_k} = \prod_{i \ne k}(x - x_i). \tag{9-1}$$
Then, the coefficients $A_k$ of the equation
$$y = \sum_{k=0}^{m} A_k P_k(x)$$
can be determined so that this equation is satisfied by each of the $m + 1$
pairs of observed values $(x_i, y_i)$. For if $x = x_k$, then
$$A_k = \frac{y_k}{P_k(x_k)}$$
since $P_k(x_i) = 0$ if $i \ne k$. Therefore,
$$y = \sum_{k=0}^{m} \frac{y_k P_k(x)}{P_k(x_k)} \tag{9-2}$$
is the equation of the mth-degree polynomial which passes through the
$m + 1$ points whose coordinates are $(x_i, y_i)$. If x is chosen as any value
in the range of the $x_i$, (9-2) determines an approximate value for the
corresponding y.
Equation (9-2) is known as Lagrange's interpolation formula. Obviously,
it can be applied when the $x_i$ are in arithmetic progression but
(8-5) is preferable in that it requires less tedious calculation. Since only
one mth-degree polynomial can be passed through $m + 1$ distinct points,
it follows that (8-5), or its equivalent (8-6), and (9-2) are merely different
forms of the same equation and will furnish the same value for y.
Example: Using the data

v    10     15     22.5   33.75  50.625  75.987
p    0.300  0.675  1.519  3.417  7.689   17.300

apply Lagrange's formula to find the value of p corresponding to $v = 21$.
If the two neighboring pairs of observed values are chosen so that $m = 1$,
$$p = 0.675\,\frac{21 - 22.5}{15 - 22.5} + 1.519\,\frac{21 - 15}{22.5 - 15} = 1.350,$$
correct to three decimal places.
If the three nearest values are chosen so that $m = 2$,
$$p = 0.300\,\frac{(21 - 15)(21 - 22.5)}{(10 - 15)(10 - 22.5)} + 0.675\,\frac{(21 - 10)(21 - 22.5)}{(15 - 10)(15 - 22.5)} + 1.519\,\frac{(21 - 10)(21 - 15)}{(22.5 - 10)(22.5 - 15)} = 1.323,$$
correct to three decimal places.
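Formula (9-2) translates almost directly into code. The following Python sketch (the name `lagrange` and the representation of the data as a list of pairs are assumptions made for illustration) repeats the two computations of the example.

```python
def lagrange(points, x):
    """Evaluate (9-2): the polynomial through the given (x_i, y_i) pairs, at x."""
    total = 0.0
    for k, (xk, yk) in enumerate(points):
        term = yk
        for i, (xi, _) in enumerate(points):
            if i != k:
                # Builds the ratio P_k(x) / P_k(x_k) one factor at a time.
                term *= (x - xi) / (xk - xi)
        total += term
    return total

data = [(10, 0.300), (15, 0.675), (22.5, 1.519)]
p_two_points = lagrange(data[1:], 21)     # m = 1, the two neighbors of v = 21
p_three_points = lagrange(data, 21)       # m = 2, the three nearest values
print(round(p_two_points, 3), round(p_three_points, 3))
```

Nothing in the sketch requires equal spacing of the abscissas, which is the point of Lagrange's form; it only requires that they be distinct.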
PROBLEMS
1. Using the data of the Example in Sec. 9, find an approximate value for p when
$v = 30$. Use $m = 1$ and $m = 2$.
2. Use $m = 1$, 2, and 3 in formula (8-6) to find an approximate value of $\theta$ when
$t = 2.3$, given

t         0      1      2      3      4      5      6      7      8
$\theta$  60.00  51.66  44.46  38.28  32.94  28.32  24.42  21.06  18.06

3. Given the data

x    0.16  0.4    1.0    2.5    6.25   15.625
y    2     2.210  2.421  2.661  2.929  3.222

find an approximate value of y corresponding to $x = 2$. Use formula (9-2) with $m = 1$
and $m = 2$.
4. Given the data

C    19     20     21      22      23      24      25
H    81.00  90.25  100.00  110.25  121.00  132.25  144.00

find an approximate value of H when $C = 21.6$. Use formulas (8-6) and (8-7) with
$m = 1$, 2, and 3.
10. Empirical Formulas. A given set of discrete data can be represented
analytically in infinitely many ways. Such analytic representations are
called empirical formulas, and the choice of the functional form for an
empirical formula ordinarily depends on the use to be made of the formula.
Thus, if a given set of data is to be represented by a function $f(x)$ which
enters in the differential equation
$$L(u) = f(x),$$
the form of $f(x)$ may well depend on the ease with which this equation can
be solved. For some types of differential operators L it may be wise to
take $f(x)$ as an algebraic polynomial, in others as an exponential, and so
on. Because of the commonness of algebraic and trigonometric polynomials
in applications, we confine our discussion of empirical formulas
primarily to these two types.
The first step usually taken by an experimenter in appraising a set of
observed values is to plot them on some coordinate paper and draw
a curve through the plotted points. If the points, when plotted
on a rectangular coordinate paper, lie approximately on a straight line,
he assumes that the equation $y = mx + b$ represents the relationship.
To determine the constants m and b, the slope and the y intercept may be
read off the graph or they may be calculated by solving two linear equations
for m and b obtained by substituting the coordinates of two judiciously chosen
points on $y = mx + b$.
If the plot of points on a logarithmic coordinate paper indicates that they
lie on a straight line, the desired relationship has the form
$$y = ax^m,$$
for on taking logarithms, we get
$$\log y = \log a + m \log x,$$
and if coordinate axes X, Y are marked so that $\log y = Y$ and $\log x = X$,
we get a linear equation
$$Y = \log a + mX.$$
Again the constants a and m can be either read off the graph or computed
by solving a pair of linear equations for m and $\log a$.
Similarly, the data can be represented by an exponential function
$$y = a\,10^{mx}$$
if the values $(x_i, y_i)$ when plotted on a semilogarithmic paper fall on a
straight line, for on taking logarithms to the base 10, we get
$$\log y = \log a + mx,$$
which is linear in $\log y$ and x.
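The analytic alternative to reading the constants off the graph, namely solving the pair of linear equations furnished by two chosen points, might be sketched in Python as follows. The function names and the sample points are hypothetical and serve only to illustrate the two logarithmic transformations.

```python
import math

def power_law_through(p1, p2):
    """Return (a, m) such that y = a * x**m passes through the points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    # In the variables X = log x, Y = log y the relation is the line Y = log a + m X.
    m = (math.log10(y2) - math.log10(y1)) / (math.log10(x2) - math.log10(x1))
    a = 10 ** (math.log10(y1) - m * math.log10(x1))
    return a, m

def exponential_through(p1, p2):
    """Return (a, m) such that y = a * 10**(m*x) passes through the points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    # Here log y is linear in x itself.
    m = (math.log10(y2) - math.log10(y1)) / (x2 - x1)
    a = 10 ** (math.log10(y1) - m * x1)
    return a, m

a_pow, m_pow = power_law_through((1, 3), (10, 300))     # hypothetical points on y = 3x^2
a_exp, m_exp = exponential_through((0, 2), (1, 20))     # hypothetical points on y = 2(10^x)
print(a_pow, m_pow, a_exp, m_exp)
```

With real observations the two points should be chosen judiciously, as the text remarks, since the result depends on which pair is used.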
When none of these simple functional relationships fits the data, one
may determine, with the aid of Sec. 7, if the data can be fitted by a poly-
nomial. It should be stressed, however, that ordinarily the choice of an
empirical formula is governed by whatever uses are to be made of it. Once
a formula is chosen, the parameters entering in it (such as the coefficients
in the polynomial representation) can be determined by imposing some
criterion for the goodness of fit of the data by the chosen function. The
method of least squares, presented in the next section, provides one of the
most commonly used of such criteria.
PROBLEMS
1. Plot the following data on a rectangular, logarithmic, or semilogarithmic paper to
determine the approximate functional relationships between y and x.
X
3
4
5
6
7
8
9
10
11
12
y
X
5
! 1
1 5
j
.6
1 2
6
M
6.4
3
7
4
7.5
5
8.2
6
8.6
7
9
8
9.5
j 9
y
2.5
3.5 J
4.3
5
5.6
6.2
6.6
7.1
7.5
X
l
2
3
4
5
6
7
8
y
0.5
0.8
1.2
1.9
3 1
4.8
7.5
11.9
2 . Verify that the data in Probs. 2, 3, and 4 in Sec. 9 may be approximated by the
following types of functions: 0 — a 10"**, y « ax m , H « ao -f -f C 2 , respectively.
Determine the parameters graphically or analytically.
11. The Method of Least Squares. We saw in Sec. 7 that the $m + 1$
coefficients in the polynomial
$$y = a_0 + a_1 x + \cdots + a_m x^m \tag{11-1}$$
can always be determined so that a given set of $m + 1$ points $(x_i, y_i)$,
where the x's are unequal, lies on the curve (11-1). When the $x_i$ are equally
spaced, the desired polynomial is determined by the formula (8-5) and,
in the more general case, by (9-2).
When the number of points is large, the degree m of the polynomial
(11-1) is high, and an attempt to represent the data exactly by (11-1)
not only is laborious but may be foolish, for the experimental data invariably
contain observational errors and it may be more sensible to represent
the data approximately by some function $y = f(x)$ which contains
a few unknown parameters. These parameters can then be determined
so that the curve $y = f(x)$ fits the data in "the best possible way." The
criteria as to what constitutes "the best possible way" are, of course,
arbitrary.
For example, we may attempt to fit the set of plotted points in Fig. 12
by the straight line
$$y = a_1 + a_2 x$$
and choose the parameters $a_1$ and $a_2$ so that the sum of the squares of
the vertical deviations of the plotted points from this line is as small as
possible.
More generally, if we choose to represent a set of data $(x_i, y_i)$, where
$i = 1, 2, \ldots, n$, by some relationship $y = f(x)$, containing r unknown
parameters $a_1, a_2, \ldots, a_r$, and form the deviations (or the residuals, as
they are also called)
$$v_i = f(x_i) - y_i, \tag{11-2}$$
the sum of the squares of the deviations
$$S = \sum_{i=1}^{n} v_i^2 = \sum_{i=1}^{n} [f(x_i) - y_i]^2 \tag{11-3}$$
is clearly a function of $a_1, a_2, \ldots, a_r$. We can then determine the a's so
that S is a minimum.
Now, if $S(a_1, a_2, \ldots, a_r)$ is a minimum, then at the point in question
$$\frac{\partial S}{\partial a_1} = 0, \quad \frac{\partial S}{\partial a_2} = 0, \quad \ldots, \quad \frac{\partial S}{\partial a_r} = 0. \tag{11-4}$$
The set of r equations (11-4), called normal equations, serves to determine
the r unknown a's in $y = f(x)$. This particular criterion of the "best fit"
of data is known as the principle of least squares, and the method of determining
the unknown parameters with its aid is called the method of least
squares. It was introduced and fully developed by Gauss$^1$ when he was
a youth of seventeen!
We indicate the construction of the normal equations first by supposing
that $y = f(x)$ is a linear function
$$y = a_1 + a_2 x. \tag{11-5}$$
1 The criterion of least squares plays a fundamental role in the approximation of a
suitably restricted function $f(x)$ by a linear combination of orthogonal functions. As is
shown in Chap. 2, Sec. 23, the partial sums of Fourier series give the best fit in the sense
of least squares. It should be noted, however, that the polynomials giving the best
fit to $f(x)$ in the sense of least squares, in general, are not the partial sums of Maclaurin's
or Taylor's series for $f(x)$.
The residuals (11-2) for (11-5) are
$$v_i = (a_1 + a_2 x_i) - y_i,$$
so that
$$S = \sum_{i=1}^{n} v_i^2 = (a_1 + a_2 x_1 - y_1)^2 + (a_1 + a_2 x_2 - y_2)^2 + \cdots + (a_1 + a_2 x_n - y_n)^2.$$
On differentiating $S$ with respect to $a_1$ and $a_2$, we deduce two equations:
$$\frac{\partial S}{\partial a_1} = 2(a_1 + a_2 x_1 - y_1) + 2(a_1 + a_2 x_2 - y_2) + \cdots + 2(a_1 + a_2 x_n - y_n) = 0,$$
$$\frac{\partial S}{\partial a_2} = 2x_1(a_1 + a_2 x_1 - y_1) + 2x_2(a_1 + a_2 x_2 - y_2) + \cdots + 2x_n(a_1 + a_2 x_n - y_n) = 0.$$
If we divide out the factor 2 and collect the coefficients of $a_1$ and $a_2$, we get
$$n a_1 + \Bigl(\sum_{i=1}^{n} x_i\Bigr) a_2 = \sum_{i=1}^{n} y_i, \qquad \Bigl(\sum_{i=1}^{n} x_i\Bigr) a_1 + \Bigl(\sum_{i=1}^{n} x_i^2\Bigr) a_2 = \sum_{i=1}^{n} x_i y_i. \tag{11-6}$$
These equations can be easily solved for $a_1$ and $a_2$.
Example 1. We illustrate the use of Eqs. (11-6) by calculating the coefficients in
$y = a_1 + a_2 x$ to fit the following data:

  x    1      2      3      4
  y    1.7    1.8    2.3    3.2

In this case $n = 4$, and since
$$\sum_{i=1}^{4} x_i = 1 + 2 + 3 + 4 = 10,$$
$$\sum_{i=1}^{4} y_i = 1.7 + 1.8 + 2.3 + 3.2 = 9,$$
$$\sum_{i=1}^{4} x_i^2 = 1 + 4 + 9 + 16 = 30,$$
$$\sum_{i=1}^{4} x_i y_i = 1.7 + 2(1.8) + 3(2.3) + 4(3.2) = 25,$$
the system (11-6) reads
$$4a_1 + 10a_2 = 9,$$
$$10a_1 + 30a_2 = 25.$$
Solving for $a_1$ and $a_2$, we get $a_1 = 1$, $a_2 = \tfrac{1}{2}$, so that the desired straight line fitting the
data in the sense of least squares is $y = 1 + \tfrac{1}{2}x$.
We suppose next that $y = f(x)$ is a polynomial
$$y = a_1 + a_2 x + a_3 x^2 + \cdots + a_r x^{r-1} = \sum_{j=1}^{r} a_j x^{j-1}. \tag{11-7}$$
The residuals $v_i$ this time are
$$v_i = \sum_{j=1}^{r} a_j x_i^{j-1} - y_i. \tag{11-8}$$
Since
$$S = \sum_{i=1}^{n} v_i^2,$$
Eqs. (11-4) can be written as
$$\frac{\partial S}{\partial a_k} = 2 \sum_{i=1}^{n} v_i \frac{\partial v_i}{\partial a_k} = 0, \qquad k = 1, 2, \ldots, r. \tag{11-9}$$
From (11-8),
$$\frac{\partial v_i}{\partial a_k} = x_i^{k-1},$$
so that, on dividing out the factor 2, we can write the normal equations
(11-9) as
$$\sum_{i=1}^{n} v_i x_i^{k-1} = 0. \tag{11-10}$$
The substitution from (11-8) in (11-10) yields
$$\sum_{i=1}^{n} \Bigl(\sum_{j=1}^{r} a_j x_i^{j-1} - y_i\Bigr) x_i^{k-1} = 0,$$
and on collecting the coefficients of the $a_j$, we get a set of $r$ linear equations
$$\sum_{j=1}^{r} \Bigl(\sum_{i=1}^{n} x_i^{j+k-2}\Bigr) a_j = \sum_{i=1}^{n} x_i^{k-1} y_i, \qquad k = 1, 2, \ldots, r, \tag{11-11}$$
for $a_1, a_2, \ldots, a_r$.
We illustrate the use of these equations by two examples.
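Example 2. Let the data in Example 1 be fitted by $y = a_1 + a_2 x + a_3 x^2$. Then
$$v_i = a_1 + a_2 x_i + a_3 x_i^2 - y_i$$
and
$$\frac{\partial v_i}{\partial a_1} = 1, \qquad \frac{\partial v_i}{\partial a_2} = x_i, \qquad \frac{\partial v_i}{\partial a_3} = x_i^2.$$
The normal equations
$$\sum_{i=1}^{4} v_i \frac{\partial v_i}{\partial a_k} = 0, \qquad k = 1, 2, 3,$$
are
$$\sum_{i=1}^{4} (a_1 + a_2 x_i + a_3 x_i^2 - y_i) \cdot 1 = 0,$$
$$\sum_{i=1}^{4} (a_1 + a_2 x_i + a_3 x_i^2 - y_i)\, x_i = 0,$$
$$\sum_{i=1}^{4} (a_1 + a_2 x_i + a_3 x_i^2 - y_i)\, x_i^2 = 0.$$
If the coefficients of the $a_j$ are collected and the normal equations put in the form
(11-11), one obtains the three equations
$$4a_1 + \Bigl(\sum x_i\Bigr) a_2 + \Bigl(\sum x_i^2\Bigr) a_3 = \sum y_i,$$
$$\Bigl(\sum x_i\Bigr) a_1 + \Bigl(\sum x_i^2\Bigr) a_2 + \Bigl(\sum x_i^3\Bigr) a_3 = \sum x_i y_i,$$
$$\Bigl(\sum x_i^2\Bigr) a_1 + \Bigl(\sum x_i^3\Bigr) a_2 + \Bigl(\sum x_i^4\Bigr) a_3 = \sum x_i^2 y_i,$$
where every sum runs over $i = 1, 2, 3, 4$. Now,
$$\sum x_i = 1 + 2 + 3 + 4 = 10, \qquad \sum x_i^2 = 1 + 4 + 9 + 16 = 30,$$
$$\sum x_i y_i = 1.7 + 3.6 + 6.9 + 12.8 = 25, \qquad \text{etc.}$$
The equations become
$$4a_1 + 10a_2 + 30a_3 = 9,$$
$$10a_1 + 30a_2 + 100a_3 = 25,$$
$$30a_1 + 100a_2 + 354a_3 = 80.8;$$
and the solutions are $a_1 = 2$, $a_2 = -0.5$, $a_3 = 0.2$.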
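The construction of the normal equations (11-11) is mechanical enough to sketch in code. The following is a minimal illustration, not part of the original text: the function names `solve_linear` and `polyfit_normal` are hypothetical, and the linear system is solved by ordinary Gaussian elimination with partial pivoting, which is one reasonable choice among several. It is run on the data of Examples 1 and 2.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b] in floating point.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def polyfit_normal(xs, ys, r):
    """Coefficients a_1..a_r of y = sum_j a_j x^(j-1), from Eqs. (11-11)."""
    # With 0-based j and k, the coefficient of a_j in equation k is sum_i x_i^(j+k).
    A = [[sum(x ** (j + k) for x in xs) for j in range(r)] for k in range(r)]
    b = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(r)]
    return solve_linear(A, b)


xs = [1, 2, 3, 4]
ys = [1.7, 1.8, 2.3, 3.2]
line = polyfit_normal(xs, ys, 2)       # straight line of Example 1
parabola = polyfit_normal(xs, ys, 3)   # quadratic of Example 2
print(line, parabola)
```

For polynomials of high degree the normal equations tend to be poorly conditioned, so a sketch of this kind is best confined to low-degree fits such as those in the examples.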
Example 3. Let us apply the method of least squares to fit the data

  x    1       2       3       4       5       6       7       8
  y    2.105   2.808   3.614   4.604   5.857   7.451   9.467   11.985

by the polynomial $y = a_1 + a_2 x + a_3 x^2 + a_4 x^3$.
In this case $n = 8$ and Eqs. (11-11) yield four normal equations obtained by setting
$k = 1, 2, 3, 4$. They are
$$8a_1 + \Bigl(\sum x_i\Bigr) a_2 + \Bigl(\sum x_i^2\Bigr) a_3 + \Bigl(\sum x_i^3\Bigr) a_4 = \sum y_i,$$
$$\Bigl(\sum x_i\Bigr) a_1 + \Bigl(\sum x_i^2\Bigr) a_2 + \Bigl(\sum x_i^3\Bigr) a_3 + \Bigl(\sum x_i^4\Bigr) a_4 = \sum x_i y_i,$$
$$\Bigl(\sum x_i^2\Bigr) a_1 + \Bigl(\sum x_i^3\Bigr) a_2 + \Bigl(\sum x_i^4\Bigr) a_3 + \Bigl(\sum x_i^5\Bigr) a_4 = \sum x_i^2 y_i,$$
$$\Bigl(\sum x_i^3\Bigr) a_1 + \Bigl(\sum x_i^4\Bigr) a_2 + \Bigl(\sum x_i^5\Bigr) a_3 + \Bigl(\sum x_i^6\Bigr) a_4 = \sum x_i^3 y_i,$$
where every sum runs over $i = 1, 2, \ldots, 8$.
From the form of the coefficients of the $a_j$, it is seen that it is convenient to make a
table of the powers of the $x_i$ and to form the sums $\sum x_i^j$ and $\sum x_i^j y_i$ before attempting to
write down the equations in explicit form.

  x_i     x_i^2    x_i^3    x_i^4    x_i^5     x_i^6
  1       1        1        1        1         1
  2       4        8        16       32        64
  3       9        27       81       243       729
  4       16       64       256      1,024     4,096
  5       25       125      625      3,125     15,625
  6       36       216      1,296    7,776     46,656
  7       49       343      2,401    16,807    117,649
  8       64       512      4,096    32,768    262,144
  Sum 36  204      1,296    8,772    61,776    446,964


  x_i    y_i       x_i y_i    x_i^2 y_i    x_i^3 y_i
  1      2.105     2.105      2.105        2.105
  2      2.808     5.616      11.232       22.464
  3      3.614     10.842     32.526       97.578
  4      4.604     18.416     73.664       294.656
  5      5.857     29.285     146.425      732.125
  6      7.451     44.706     268.236      1,609.416
  7      9.467     66.269     463.883      3,247.181
  8      11.985    95.880     767.040      6,136.320
  Sum    47.891    273.119    1,765.111    12,141.845

When the values given in the tables are inserted, the normal equations become
$$8a_1 + 36a_2 + 204a_3 + 1{,}296a_4 = 47.891,$$
$$36a_1 + 204a_2 + 1{,}296a_3 + 8{,}772a_4 = 273.119,$$
$$204a_1 + 1{,}296a_2 + 8{,}772a_3 + 61{,}776a_4 = 1{,}765.111,$$
$$1{,}296a_1 + 8{,}772a_2 + 61{,}776a_3 + 446{,}964a_4 = 12{,}141.845.$$
The solutions are
$$a_1 = 1.426, \qquad a_2 = 0.693, \qquad a_3 = -0.028, \qquad a_4 = 0.013.$$
Therefore, the equation, as determined by the method of least squares, is
$$y = 1.426 + 0.693x - 0.028x^2 + 0.013x^3.$$
The normal equations (11-11), corresponding to the polynomial representation
of data, are linear in the coefficients $a_j$. They need not be linear
in the unknown parameters if the function $y = f(x)$ is not a polynomial
in $x$. In this event the solution of the system (11-4) may prove difficult,
and one may be obliged to seek an approximate solution by replacing the
exact residuals (11-2) by approximate residuals which are linear in the
unknowns. This is accomplished by expanding $y = f(x)$, treated as a
function of $a_1, a_2, \ldots, a_r$, in Taylor’s series in terms of $a_i - \bar a_i \equiv \Delta a_i$,
where the $\bar a_i$ are approximate values of the $a_i$. The values of $\bar a_i$ may be
obtained by graphical means or by solving any $r$ of the equations $y_i = f(x_i)$.
The expansion gives
$$y = f(x, a_1, \ldots, a_r) = f(x, \bar a_1 + \Delta a_1, \ldots, \bar a_r + \Delta a_r)$$
$$= f(x, \bar a_1, \ldots, \bar a_r) + \sum_{k=1}^{r} \frac{\partial f}{\partial \bar a_k}\, \Delta a_k + \frac{1}{2!} \sum_{j,k=1}^{r} \frac{\partial^2 f}{\partial \bar a_j\, \partial \bar a_k}\, \Delta a_j\, \Delta a_k + \cdots, \tag{11-12}$$
where
$$\frac{\partial f}{\partial \bar a_k} \equiv \left.\frac{\partial f}{\partial a_k}\right|_{a_i = \bar a_i}, \qquad \frac{\partial^2 f}{\partial \bar a_j\, \partial \bar a_k} \equiv \left.\frac{\partial^2 f}{\partial a_j\, \partial a_k}\right|_{a_i = \bar a_i}, \qquad \text{etc.}$$
Assuming that the $\bar a_i$ are chosen so that the $\Delta a_i$ are small, the terms of
degree higher than the first can be neglected and (11-12) becomes
$$y = f(x, \bar a_1, \ldots, \bar a_r) + \sum_{k=1}^{r} \frac{\partial f}{\partial \bar a_k}\, \Delta a_k.$$
The $n$ observation equations are then replaced by the $n$ approximate
equations
$$y_i = f(x_i, \bar a_1, \ldots, \bar a_r) + \sum_{k=1}^{r} \left.\frac{\partial f}{\partial \bar a_k}\right|_{x = x_i} \Delta a_k, \qquad i = 1, 2, \ldots, n. \tag{11-13}$$
If (11-13) is used, the residuals $v_i$ will be linear in the $\Delta a_k$, and hence the
resulting conditions, which become
$$\frac{\partial S}{\partial(\Delta a_k)} = 0, \qquad k = 1, 2, \ldots, r, \tag{11-14}$$
also will be linear in the $\Delta a_k$. Equations (11-14) are called the normal
equations in this case.
We illustrate the use of Eqs. (11-14) in Example 4.
Example 4. We seek to determine the constants $k$ and $a$ in the formula $\theta = k a^t$
chosen to represent the following data:

  t    1        2        3        4
  θ    51.66    44.46    38.28    32.94

The determination of $k$ and $a$ in this problem can be reduced to the solution of two
linear equations, for if we write $\theta = k a^t$ in the form
$$\theta = k\, 10^{bt},$$
then on taking logarithms to the base 10, we get
$$\log \theta = \log k + bt.$$
Setting $\log \theta = y$ and $\log k = K$, we get
$$y = K + bt, \tag{11-15}$$
which is linear in $K$ and $b$. These constants can be determined by the procedure described
above, which leads to the solution of a pair of linear equations.$^1$
$^1$ However, the approximation obtained by this means does not give an approximation
to the original equation in the sense of least squares.
To illustrate the use of formulas (11-12) to (11-14), we follow a more laborious route
which gives an approximation to the original equation.
When the values recorded in the table are plotted on semilogarithmic paper, it is
found that $k = 60$ and $a = 10^{-0.068} = 0.86$, approximately. This suggests using $k_0 = 60$
and $a_0 = 0.9$ as the first approximations. The first two terms of the expansion in
Taylor’s series in terms of $\Delta k = k - 60$ and $\Delta a = a - 0.9$ are
$$\theta = 60(0.9)^t + \left(\frac{\partial \theta}{\partial k}\right)_{k=60,\ a=0.9} \Delta k + \left(\frac{\partial \theta}{\partial a}\right)_{k=60,\ a=0.9} \Delta a$$
$$= 60(0.9)^t + (0.9)^t\, \Delta k + 60t(0.9)^{t-1}\, \Delta a.$$
If the values $(t_i, \theta_i)$ are substituted in this equation, four equations result, namely,
$$\theta_i = 60(0.9)^{t_i} + (0.9)^{t_i}\, \Delta k + 60 t_i (0.9)^{t_i - 1}\, \Delta a, \qquad i = 1, 2, 3, 4.$$
The problem of obtaining from these four equations the values of $\Delta k$ and $\Delta a$, which
furnish the desired values of $k$ and $a$, is precisely the same as in the case in which the original
equation is linear in its constants. The residual equations are
$$v_i = (0.9)^{t_i}\, \Delta k + 60 t_i (0.9)^{t_i - 1}\, \Delta a + 60(0.9)^{t_i} - \theta_i, \qquad i = 1, 2, 3, 4.$$
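Therefore
$$S = \sum_{i=1}^{4} v_i^2 = \sum_{i=1}^{4} \bigl[(0.9)^{t_i}\, \Delta k + 60 t_i (0.9)^{t_i - 1}\, \Delta a + 60(0.9)^{t_i} - \theta_i\bigr]^2$$
and the normal equations
$$\frac{\partial S}{\partial(\Delta k)} = 0 \qquad \text{and} \qquad \frac{\partial S}{\partial(\Delta a)} = 0$$
become
$$2 \sum_{i=1}^{4} \bigl[(0.9)^{t_i}\, \Delta k + 60 t_i (0.9)^{t_i - 1}\, \Delta a + 60(0.9)^{t_i} - \theta_i\bigr] (0.9)^{t_i} = 0$$
and
$$2 \sum_{i=1}^{4} \bigl[(0.9)^{t_i}\, \Delta k + 60 t_i (0.9)^{t_i - 1}\, \Delta a + 60(0.9)^{t_i} - \theta_i\bigr]\, 60 t_i (0.9)^{t_i - 1} = 0.$$
When these equations are written in the form
$$p\, \Delta k + q\, \Delta a = r,$$
with all common factors divided out, they are
$$\sum_{i=1}^{4} (0.9)^{2t_i}\, \Delta k + 60 \sum_{i=1}^{4} t_i (0.9)^{2t_i - 1}\, \Delta a = \sum_{i=1}^{4} \theta_i (0.9)^{t_i} - 60 \sum_{i=1}^{4} (0.9)^{2t_i}$$
and
$$\sum_{i=1}^{4} t_i (0.9)^{2t_i - 1}\, \Delta k + 60 \sum_{i=1}^{4} t_i^2 (0.9)^{2t_i - 2}\, \Delta a = \sum_{i=1}^{4} \theta_i t_i (0.9)^{t_i - 1} - 60 \sum_{i=1}^{4} t_i (0.9)^{2t_i - 1}.$$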
As in Example 3, the coefficients are computed most conveniently by the use of a table.

  t_i                         1         2          3           4             Totals
  (0.9)^{t_i}                 0.9       0.81       0.729       0.6561
  (0.9)^{2t_i}                0.81      0.6561     0.531441    0.43046721    2.42800821
  t_i (0.9)^{2t_i - 1}        0.9       1.458      1.77147     1.9131876     6.0426576
  t_i^2 (0.9)^{2t_i - 2}      1         3.24       5.9049      8.503056      18.647956
  θ_i (0.9)^{t_i}             46.494    36.0126    27.90612    21.611934     132.024654
  θ_i t_i (0.9)^{t_i - 1}     51.66     80.028     93.0204     96.05304      320.76144

Substituting the values of the sums from the table gives
$$2.42800821\, \Delta k + 362.559456\, \Delta a = 132.024654 - 145.6804926$$
and
$$6.0426576\, \Delta k + 1{,}118.87736\, \Delta a = 320.76144 - 362.559456.$$
Reducing all the numbers to four decimal places gives the following equations to solve
for $\Delta k$ and $\Delta a$:
$$2.4280\, \Delta k + 362.5595\, \Delta a = -13.6558,$$
$$6.0427\, \Delta k + 1{,}118.8774\, \Delta a = -41.7980.$$
The solutions are
$$\Delta k = -0.238 \qquad \text{and} \qquad \Delta a = -0.036.$$
Hence, the required equation is
$$\theta = 59.762(0.864)^t.$$
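The linearization of Example 4 can be checked numerically. The sketch below is an illustration under stated assumptions rather than the book's own procedure: the function name `linearized_fit` is hypothetical, the first ordinate is taken as 51.66 (the value used in the table of the example), and the two normal equations (11-14) are solved by Cramer's rule without first dividing out the common factor 60.

```python
def linearized_fit(ts, thetas, k0, a0):
    """One linearized least-squares correction (dk, da) for theta = k * a**t."""
    # Partial derivatives of k*a**t at (k0, a0).
    fk = [a0 ** t for t in ts]                   # d(theta)/dk
    fa = [k0 * t * a0 ** (t - 1) for t in ts]    # d(theta)/da
    # Departure of the data from the first approximation k0*a0**t.
    res = [th - k0 * a0 ** t for t, th in zip(ts, thetas)]
    # Normal equations: p11*dk + p12*da = r1, p12*dk + p22*da = r2.
    p11 = sum(u * u for u in fk)
    p12 = sum(u * v for u, v in zip(fk, fa))
    p22 = sum(v * v for v in fa)
    r1 = sum(u * w for u, w in zip(fk, res))
    r2 = sum(v * w for v, w in zip(fa, res))
    det = p11 * p22 - p12 * p12
    dk = (r1 * p22 - p12 * r2) / det
    da = (p11 * r2 - p12 * r1) / det
    return dk, da


ts = [1, 2, 3, 4]
thetas = [51.66, 44.46, 38.28, 32.94]
dk, da = linearized_fit(ts, thetas, 60.0, 0.9)
k, a = 60.0 + dk, 0.9 + da
print(round(dk, 3), round(da, 3), round(k, 3), round(a, 3))
```

Repeating the step with the corrected values as new first approximations would refine the constants further; the example in the text stops after one correction.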
PROBLEMS
1. Apply the method of least squares to find the constants in $y = a_1 + a_2 x + a_3 x^2$
to fit the data

  x    1       2       3       4        5        6
  y    3.13    3.76    6.94    12.62    20.86    31.53

2. Determine by the method of least squares the constants $a$ and $n$ in $p = a v^n$ to
fit the following data by writing the equation in the form
$$\log p = n \log v + \log a.$$

  v    10       15       22.5     33.7     50.6     75.9
  p    0.300    0.675    1.519    3.417    7.689    17.300

Hint: Set $\log p = y$, $\log v = x$, and determine the constants in the resulting linear
equation.
3. Compare the result of Example 4 with the calculation of the constants in (11-15).
12. Harmonic Analysis. The problem of representing a suitable periodic
function in a trigonometric series was considered in some detail in Chap.
2. In this section we give a brief discussion of the problem of fitting a
finite trigonometric sum to a set of observed values $(x_i, y_i)$. Let the set
of observed values
$$(x_0, y_0),\ (x_1, y_1),\ \ldots,\ (x_{2n-1}, y_{2n-1}),\ (x_{2n}, y_{2n})$$
be such that the values of $y$ start repeating with $y_{2n}$ (that is, $y_{2n} = y_0$,
$y_{2n+1} = y_1$, etc.). It will be assumed that the $x_i$ are equally spaced, that
$x_0 = 0$, and that $x_{2n} = 2\pi$. [If $x_0 \neq 0$ and the period is $c$ instead of
$2\pi$, the variable can be changed by setting
$$\theta_i = \frac{2\pi}{c}(x_i - x_0).$$
The discussion would then be carried through for $\theta_i$ and $y_i$ in place of the
$x_i$ and $y_i$ used below.] Under these assumptions
$$x_i = \frac{i\pi}{n}.$$
The trigonometric polynomial
$$y = A_0 + \sum_{k=1}^{n} A_k \cos kx + \sum_{k=1}^{n-1} B_k \sin kx \tag{12-1}$$
contains the $2n$ unknown constants
$$A_0, A_1, A_2, \ldots, A_n, B_1, B_2, \ldots, B_{n-1},$$
which can be determined so that (12-1) will pass through the $2n$ given
points $(x_i, y_i)$ by solving the $2n$ simultaneous equations
$$y_i = A_0 + \sum_{k=1}^{n} A_k \cos kx_i + \sum_{k=1}^{n-1} B_k \sin kx_i, \qquad i = 0, 1, 2, \ldots, 2n - 1.$$
Since $x_i = i\pi/n$, these equations become
$$y_i = A_0 + \sum_{k=1}^{n} A_k \cos \frac{ik\pi}{n} + \sum_{k=1}^{n-1} B_k \sin \frac{ik\pi}{n}, \qquad i = 0, 1, 2, \ldots, 2n - 1. \tag{12-2}$$
The solution of Eqs. (12-2) is much simplified by means of a scheme
somewhat similar to that used in determining the Fourier coefficients.
Multiplying both sides of each equation by the coefficient of $A_0$ (that is,
by unity) and adding the results give
$$\sum_{i=0}^{2n-1} y_i = 2nA_0 + \sum_{k=1}^{n} \Bigl(\sum_{i=0}^{2n-1} \cos \frac{ik\pi}{n}\Bigr) A_k + \sum_{k=1}^{n-1} \Bigl(\sum_{i=0}^{2n-1} \sin \frac{ik\pi}{n}\Bigr) B_k.$$
It can be established that (cf. Example 1, Sec. 17, Chap. 2)
$$\sum_{i=0}^{2n-1} \cos \frac{ik\pi}{n} = 0, \qquad k = 1, 2, \ldots, n,$$
and
$$\sum_{i=0}^{2n-1} \sin \frac{ik\pi}{n} = 0, \qquad k = 1, 2, \ldots, n - 1.$$
Therefore,
$$2nA_0 = \sum_{i=0}^{2n-1} y_i. \tag{12-3}$$
Multiplying both sides of each equation in (12-2) by the coefficient of $A_j$
in it, and adding the results, give
$$\sum_{i=0}^{2n-1} y_i \cos \frac{ij\pi}{n} = \sum_{k=1}^{n} \Bigl(\sum_{i=0}^{2n-1} \cos \frac{ik\pi}{n} \cos \frac{ij\pi}{n}\Bigr) A_k + \sum_{k=1}^{n-1} \Bigl(\sum_{i=0}^{2n-1} \sin \frac{ik\pi}{n} \cos \frac{ij\pi}{n}\Bigr) B_k$$
for $j = 1, 2, \ldots, n - 1$. But
$$\sum_{i=0}^{2n-1} \cos \frac{ik\pi}{n} \cos \frac{ij\pi}{n} = \begin{cases} 0, & \text{if } k \neq j, \\ n, & \text{if } k = j, \end{cases}$$
and
$$\sum_{i=0}^{2n-1} \sin \frac{ik\pi}{n} \cos \frac{ij\pi}{n} = 0$$
for all values of $k$. Therefore,
$$nA_j = \sum_{i=0}^{2n-1} y_i \cos \frac{ij\pi}{n}, \qquad j = 1, 2, \ldots, n - 1. \tag{12-4}$$
To determine the coefficient $A_n$ the procedure is precisely the same,
but
$$\sum_{i=0}^{2n-1} \cos \frac{ik\pi}{n} \cos i\pi = \begin{cases} 0, & \text{if } k \neq n, \\ 2n, & \text{if } k = n. \end{cases}$$
Hence,
$$2nA_n = \sum_{i=0}^{2n-1} y_i \cos i\pi. \tag{12-5}$$
Similarly, on multiplying both sides of each equation of (12-2) by the
coefficient of $B_j$ in it and adding, one finds that
$$nB_j = \sum_{i=0}^{2n-1} y_i \sin \frac{ij\pi}{n}, \qquad j = 1, 2, \ldots, n - 1. \tag{12-6}$$
Equations (12-3) to (12-6) give the constants in (12-1). A compact
schematic arrangement is often used to simplify the labor of evaluating
these constants. It will be illustrated in the so-called “6-ordinate” case,
that is, when $2n = 6$. The method is based on the equations that determine
the constants, together with relations such as
$$\sin \frac{\pi}{n} = \sin \frac{(n-1)\pi}{n} = -\sin \frac{(n+1)\pi}{n} = -\sin \frac{(2n-1)\pi}{n},$$
$$\cos \frac{\pi}{n} = -\cos \frac{(n-1)\pi}{n} = -\cos \frac{(n+1)\pi}{n} = \cos \frac{(2n-1)\pi}{n}.$$
Six-ordinate Scheme. Here, $2n = 6$; the given points are $(x_i, y_i)$, where
$x_i = i\pi/3$ ($i = 0, 1, 2, 3, 4, 5$); and Eq. (12-1) becomes
$$y = A_0 + A_1 \cos x + A_2 \cos 2x + A_3 \cos 3x + B_1 \sin x + B_2 \sin 2x.$$
Make the following table of definitions:

                y_0    y_1    y_2        v_0    v_1        w_0    w_1
                y_3    y_4    y_5               v_2               w_2
  Sum           v_0    v_1    v_2        p_0    p_1        r_0    r_1
  Difference    w_0    w_1    w_2               q_1               s_1

It can be checked easily that Eqs. (12-3) to (12-6), with $n = 3$, become
$$6A_0 = p_0 + p_1, \qquad 3A_1 = r_0 + \tfrac{1}{2} s_1, \qquad 3A_2 = p_0 - \tfrac{1}{2} p_1,$$
$$6A_3 = r_0 - s_1, \qquad 3B_1 = \frac{\sqrt{3}}{2}\, r_1, \qquad 3B_2 = \frac{\sqrt{3}}{2}\, q_1.$$
Example: In particular, suppose that the given points are

  x    0      π/3    2π/3    π      4π/3    5π/3    2π
  y    1.0    1.4    1.9     1.7    1.5     1.2     1.0

Upon using these values of $y$ in the table of definitions above,

                1.0     1.4     1.9        2.7    2.9        -0.7    -0.1
                1.7     1.5     1.2               3.1                 0.7
  Sum           2.7     2.9     3.1        2.7    6.0        -0.7     0.6
  Difference    -0.7    -0.1    0.7               -0.2               -0.8

so that $v_0 = 2.7$, $v_1 = 2.9$, $v_2 = 3.1$, $w_0 = -0.7$, $w_1 = -0.1$, $w_2 = 0.7$, and
$p_0 = 2.7$, $p_1 = 6.0$, $q_1 = -0.2$, $r_0 = -0.7$, $r_1 = 0.6$, $s_1 = -0.8$.
Therefore, the equations determining the values of the constants are
$$6A_0 = 2.7 + 6.0 = 8.7 \qquad \text{and} \qquad A_0 = 1.45,$$
$$3A_1 = -0.7 - 0.4 = -1.1 \qquad \text{and} \qquad A_1 = -0.37,$$
$$3A_2 = 2.7 - 3.0 = -0.3 \qquad \text{and} \qquad A_2 = -0.10,$$
$$6A_3 = -0.7 + 0.8 = 0.1 \qquad \text{and} \qquad A_3 = 0.02,$$
$$3B_1 = \frac{\sqrt{3}}{2}(0.6) = 0.3\sqrt{3} \qquad \text{and} \qquad B_1 = 0.17,$$
$$3B_2 = \frac{\sqrt{3}}{2}(-0.2) = -0.1\sqrt{3} \qquad \text{and} \qquad B_2 = -0.06.$$
Hence, the curve of type (12-1) that fits the given data is
$$y = 1.45 - 0.37 \cos x - 0.10 \cos 2x + 0.02 \cos 3x + 0.17 \sin x - 0.06 \sin 2x.$$
A convenient check upon the computations is furnished by the relations
$$A_0 + A_1 + A_2 + A_3 = y_0 \qquad \text{and} \qquad B_1 + B_2 = \frac{\sqrt{3}}{3}(y_1 - y_5).$$
Substituting the values found above in the left-hand members gives
$$1.45 - 0.37 - 0.10 + 0.02 = 1.0 \qquad \text{and} \qquad 0.17 - 0.06 = 0.11,$$
which check with the values of the right-hand members.
Similar tables can be constructed for 8-ordinates, 12-ordinates, etc.
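Formulas (12-3) to (12-6) can also be evaluated directly for any even number of ordinates. The following sketch is illustrative only; the function names `harmonic_coefficients` and `evaluate` are hypothetical, and the sums are formed term by term rather than by the tabular scheme described above. It is run on the six ordinates of the example.

```python
import math


def harmonic_coefficients(ys):
    """Constants A_0..A_n and B_1..B_{n-1} of (12-1) from ordinates y_0..y_{2n-1}."""
    two_n = len(ys)
    if two_n == 0 or two_n % 2:
        raise ValueError("an even, nonzero number of ordinates is required")
    n = two_n // 2
    A = [0.0] * (n + 1)
    B = [0.0] * n                      # B[0] is unused; B[1..n-1] hold the sine terms
    A[0] = sum(ys) / two_n                                                          # (12-3)
    for j in range(1, n):
        A[j] = sum(y * math.cos(i * j * math.pi / n) for i, y in enumerate(ys)) / n  # (12-4)
        B[j] = sum(y * math.sin(i * j * math.pi / n) for i, y in enumerate(ys)) / n  # (12-6)
    A[n] = sum(y * math.cos(i * math.pi) for i, y in enumerate(ys)) / two_n          # (12-5)
    return A, B


def evaluate(A, B, x):
    """Value of the trigonometric polynomial (12-1) at x."""
    n = len(A) - 1
    value = A[0]
    value += sum(A[k] * math.cos(k * x) for k in range(1, n + 1))
    value += sum(B[k] * math.sin(k * x) for k in range(1, n))
    return value


ys = [1.0, 1.4, 1.9, 1.7, 1.5, 1.2]
A, B = harmonic_coefficients(ys)
print([round(c, 2) for c in A], [round(c, 2) for c in B[1:]])
```

Because the polynomial has as many constants as there are ordinates, `evaluate` reproduces each given $y_i$ at $x_i = i\pi/n$ up to rounding error.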
NUMERICAL INTEGRATION OF DIFFERENTIAL EQUATIONS
13. Numerical Integration. The reader is familiar with the interpretation
of the definite integral $\int_a^b f(x)\, dx$ as the area under the curve $y = f(x)$
between the ordinates $x = a$ and $x = b$. This interpretation underlies the
construction of formulas for numerical integration contained in this section.
It will be recalled that if the function $f(x)$ is such that its indefinite
integral can be obtained, then the fundamental theorem of integral calculus
provides an easy means for evaluating the definite integral.$^1$ However,
when $f(x)$ does not have an indefinite integral expressible in terms of known
functions, or when the values of $f(x)$ are given in tabular form, formulas
for numerical integration are generally used to obtain an approximate
value of the integral.
Formulas for numerical integration, or mechanical quadrature, are obtained
by replacing the function $f(x)$ specified at a given number of points
in the interval $(a,b)$ by a polynomial (8-5) or (9-2), depending on whether
the values of $x$ are equally or unequally spaced.
If the values of $y = f(x)$ are known at $m + 1$ points $x_i$, where $i = 0,
1, 2, \ldots, m$, which are spaced $h$ units apart, an approximate value of the
integral $\int_{x_0}^{x_m} f(x)\, dx$ can be computed by substituting in the integrand an
approximate polynomial representation of $y = f(x)$ given by (8-5) or,
equivalently, (8-6). We thus get for equally spaced values $x_i$
$$\int_0^m y\, dX = \int_0^m \Bigl[y_0 + X\, \Delta y_0 + \frac{X(X-1)}{2!}\, \Delta^2 y_0 + \cdots + \frac{X(X-1) \cdots (X-m+1)}{m!}\, \Delta^m y_0\Bigr] dX, \tag{13-1}$$
where $X$ is the dimensionless variable defined by
$$X = \frac{x - x_0}{h} \tag{13-2}$$
and $X = m$ for
$$x_m = x_0 + mh. \tag{13-3}$$
$^1$ See Chap. 3, Sec. 13. The evaluation of difficult integrals by power series is discussed
in Chap. 2, Sec. 10.
If $m = 1$, formula (13-1) yields
$$\int_0^1 y\, dX = \int_0^1 (y_0 + X\, \Delta y_0)\, dX = y_0 + \frac{\Delta y_0}{2} = y_0 + \frac{y_1 - y_0}{2} = \frac{1}{2}(y_0 + y_1).$$
But from (13-2) $dX = dx/h$, and on recalling (13-3), we see that this
formula can be written as
$$\int_{x_0}^{x_1} y\, dx = \frac{h}{2}(y_0 + y_1). \tag{13-4}$$
Since $y_0$ is the ordinate of $y = f(x)$ at $x = x_0$ and $y_1$ is the ordinate at
$x = x_1$, the right-hand member in (13-4) represents the area of the first
trapezoid shown in Fig. 13. The choice of $m = 1$ in the calculations
leading to (13-4) corresponds to replacing $y = f(x)$ in the interval $(x_0, x_1)$
by the straight line through $(x_0, y_0)$ and $(x_1, y_1)$.
The successive application of (13-4) to intervals $(x_1, x_2)$, $(x_2, x_3)$, $\ldots$,
$(x_{n-1}, x_n)$ yields
$$\int_{x_0}^{x_n} y\, dx = \int_{x_0}^{x_1} y\, dx + \int_{x_1}^{x_2} y\, dx + \cdots + \int_{x_{n-1}}^{x_n} y\, dx$$
$$= \frac{h}{2}(y_0 + y_1) + \frac{h}{2}(y_1 + y_2) + \cdots + \frac{h}{2}(y_{n-1} + y_n)$$
$$= \frac{h}{2}(y_0 + 2y_1 + 2y_2 + \cdots + 2y_{n-1} + y_n). \tag{13-5}$$
Formula (13-5) is known as the trapezoidal rule, for it gives the value of
the sum of the areas of the $n$ trapezoids whose bases are the ordinates
$y_0, y_1, y_2, \ldots, y_n$. Figure 13 shows the six trapezoids in the case of $n = 6$.
If $m = 2$, (13-1) becomes
$$\int_0^2 y\, dX = \int_0^2 \Bigl[y_0 + X\, \Delta y_0 + \frac{X^2 - X}{2}\, \Delta^2 y_0\Bigr] dX$$
$$= 2y_0 + 2\, \Delta y_0 + \frac{1}{2}\Bigl(\frac{8}{3} - 2\Bigr) \Delta^2 y_0$$
$$= 2y_0 + 2(y_1 - y_0) + \frac{1}{3}(y_2 - 2y_1 + y_0)$$
$$= \frac{1}{3} y_0 + \frac{4}{3} y_1 + \frac{1}{3} y_2,$$
or
$$\int_{x_0}^{x_2} y\, dx = \frac{h}{3}(y_0 + 4y_1 + y_2). \tag{13-6}$$
Suppose that there are $n + 1$ pairs of given values, where $n$ is even. If
these $n + 1$ pairs are divided into the groups of three pairs with abscissas
$x_{2i}$, $x_{2i+1}$, $x_{2i+2}$, where $i = 0, 1, \ldots, (n - 2)/2$, then (13-6) can be applied
to each group. Hence,
$$\int_{x_0}^{x_n} y\, dx = \int_{x_0}^{x_2} y\, dx + \int_{x_2}^{x_4} y\, dx + \cdots + \int_{x_{n-2}}^{x_n} y\, dx$$
$$= \frac{h}{3}(y_0 + 4y_1 + y_2) + \frac{h}{3}(y_2 + 4y_3 + y_4) + \cdots + \frac{h}{3}(y_{n-2} + 4y_{n-1} + y_n)$$
$$= \frac{h}{3}\bigl[y_0 + y_n + 4(y_1 + y_3 + \cdots + y_{n-1}) + 2(y_2 + y_4 + \cdots + y_{n-2})\bigr]. \tag{13-7}$$
Formula (13-7) is known as Simpson's rule with $m = 2$. Interpreted
geometrically, it gives the value of the sum of the areas under the second-degree
parabolas that have been passed through the points $(x_{2i}, y_{2i})$,
$(x_{2i+1}, y_{2i+1})$, $(x_{2i+2}, y_{2i+2})$, where $i = 0, 1, 2, \ldots, (n - 2)/2$.
If $m = 3$, (13-1) states that
$$\int_0^3 y\, dX = \int_0^3 \Bigl(y_0 + X\, \Delta y_0 + \frac{X^2 - X}{2}\, \Delta^2 y_0 + \frac{X^3 - 3X^2 + 2X}{6}\, \Delta^3 y_0\Bigr) dX$$
$$= 3y_0 + \frac{9}{2}\, \Delta y_0 + \Bigl(\frac{9}{2} - \frac{9}{4}\Bigr) \Delta^2 y_0 + \Bigl(\frac{27}{8} - \frac{9}{2} + \frac{3}{2}\Bigr) \Delta^3 y_0$$
$$= 3y_0 + \frac{9}{2}(y_1 - y_0) + \frac{9}{4}(y_2 - 2y_1 + y_0) + \frac{3}{8}(y_3 - 3y_2 + 3y_1 - y_0)$$
$$= \frac{3}{8}(y_0 + 3y_1 + 3y_2 + y_3),$$
or
$$\int_{x_0}^{x_3} y\, dx = \frac{3h}{8}(y_0 + 3y_1 + 3y_2 + y_3). \tag{13-8}$$
If $n + 1$ pairs of values are given, and if $n$ is a multiple of 3, then (13-8)
can be applied successively to groups of four pairs of values to give
$$\int_{x_0}^{x_n} y\, dx = \frac{3h}{8}\bigl[y_0 + y_n + 3(y_1 + y_2 + y_4 + y_5 + \cdots + y_{n-2} + y_{n-1}) + 2(y_3 + y_6 + \cdots + y_{n-3})\bigr]. \tag{13-9}$$
Formula (13-9) is called Simpson's rule with $m = 3$. It is not encountered
so frequently as (13-5) or (13-7). Other formulas for numerical
integration can be derived by setting $m = 4, 5, \ldots$ in (13-1), but the three
given here are sufficient for ordinary purposes. In most cases, better
results are obtained by securing a large number of observed or computed
values, so that $h$ will be small, and using (13-5) or (13-7).
Example 1. Using the data given in the Example of Sec. 7, find an approximate value for
$\int_1^7 y\, dx$.
Using the trapezoidal rule (13-5) gives
$$\int_1^7 y\, dx = \tfrac{1}{2}(2.105 + 5.616 + 7.228 + 9.208 + 11.714 + 14.902 + 9.467) = 30.120.$$
Using (13-7) gives
$$\int_1^7 y\, dx = \tfrac{1}{3}[2.105 + 9.467 + 4(2.808 + 4.604 + 7.451) + 2(3.614 + 5.857)] = 29.989.$$
Using (13-9) gives
$$\int_1^7 y\, dx = \tfrac{3}{8}[2.105 + 9.467 + 3(2.808 + 3.614 + 5.857 + 7.451) + 2(4.604)] = 29.989.$$
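The three composite rules (13-5), (13-7), and (13-9) translate directly into short functions. The sketch below is an illustration, not part of the original text; the function names are hypothetical, and the ordinates are those appearing in the arithmetic of Example 1 with $h = 1$.

```python
def trapezoidal(ys, h):
    """Trapezoidal rule (13-5) for equally spaced ordinates."""
    return h / 2 * (ys[0] + ys[-1] + 2 * sum(ys[1:-1]))


def simpson(ys, h):
    """Simpson's rule with m = 2, formula (13-7); needs an even number of intervals."""
    n = len(ys) - 1
    if n < 2 or n % 2:
        raise ValueError("the number of intervals must be even")
    odd = sum(ys[1:-1:2])     # y_1 + y_3 + ... + y_{n-1}
    even = sum(ys[2:-1:2])    # y_2 + y_4 + ... + y_{n-2}
    return h / 3 * (ys[0] + ys[-1] + 4 * odd + 2 * even)


def simpson_three_eighths(ys, h):
    """Simpson's rule with m = 3, formula (13-9); needs a multiple of 3 intervals."""
    n = len(ys) - 1
    if n < 3 or n % 3:
        raise ValueError("the number of intervals must be a multiple of 3")
    # Interior ordinates get weight 2 at indices divisible by 3, weight 3 otherwise.
    interior = sum((2 if i % 3 == 0 else 3) * ys[i] for i in range(1, n))
    return 3 * h / 8 * (ys[0] + ys[-1] + interior)


ys = [2.105, 2.808, 3.614, 4.604, 5.857, 7.451, 9.467]
print(round(trapezoidal(ys, 1.0), 3),
      round(simpson(ys, 1.0), 3),
      round(simpson_three_eighths(ys, 1.0), 3))
```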
If numerical integration is to be used in a problem in which the form of
$f(x)$ is known, the set of values $(x_i, y_i)$ can usually be chosen so that the
$x_i$ form an arithmetic progression and one of the formulas deduced above
can be applied. Even if it is expedient to choose values closer together
for some parts of the range than for other parts, these formulas can be
applied successively, with appropriate values of $h$, to those sets of values
for which the $x_i$ form an arithmetic progression. However, if the set of
given values was obtained by observation, it is frequently convenient to
use a formula that does not require that the $x_i$ form an arithmetic progression.
Suppose that a set of pairs of observed values $(x_i, y_i)$, where $i = 0, 1,
2, \ldots, m$, is given. The points $(x_i, y_i)$ all lie on the curve whose equation
is given by (9-2). The area under this curve between $x = x_0$ and $x = x_m$
is an approximation to the value of $\int_{x_0}^{x_m} y\, dx$. The area under the curve
(9-2) is
$$\int_{x_0}^{x_m} y\, dx = \sum_{k=0}^{m} \frac{y_k}{P_k(x_k)} \int_{x_0}^{x_m} P_k(x)\, dx, \tag{13-10}$$
in which the expressions for the $P_k(x)$ are given by (9-1).
If $m = 1$, (9-1) and (13-10) give
$$\int_{x_0}^{x_1} y\, dx = \frac{y_0}{x_0 - x_1} \int_{x_0}^{x_1} (x - x_1)\, dx + \frac{y_1}{x_1 - x_0} \int_{x_0}^{x_1} (x - x_0)\, dx = \frac{x_1 - x_0}{2}(y_0 + y_1). \tag{13-11}$$
Formula (13-11) is identical with (13-4), as would be expected, but the
formula corresponding to (13-5) is
$$\int_{x_0}^{x_n} y\, dx = \tfrac{1}{2}\bigl[(x_1 - x_0)(y_0 + y_1) + (x_2 - x_1)(y_1 + y_2) + \cdots + (x_n - x_{n-1})(y_{n-1} + y_n)\bigr]. \tag{13-12}$$
If $m = 2$, (13-10) becomes
$$\int_{x_0}^{x_2} y\, dx = \frac{y_0}{P_0(x_0)} \Bigl[\frac{x_2^3 - x_0^3}{3} - \frac{(x_1 + x_2)(x_2^2 - x_0^2)}{2} + x_1 x_2 (x_2 - x_0)\Bigr]$$
$$+ \frac{y_1}{P_1(x_1)} \Bigl[\frac{x_2^3 - x_0^3}{3} - \frac{(x_0 + x_2)(x_2^2 - x_0^2)}{2} + x_0 x_2 (x_2 - x_0)\Bigr]$$
$$+ \frac{y_2}{P_2(x_2)} \Bigl[\frac{x_2^3 - x_0^3}{3} - \frac{(x_0 + x_1)(x_2^2 - x_0^2)}{2} + x_0 x_1 (x_2 - x_0)\Bigr]$$
$$= \frac{(x_2 - x_0)^2}{6} \Bigl[\frac{y_0}{P_0(x_0)}(3x_1 - 2x_0 - x_2) + \frac{y_1}{P_1(x_1)}(x_0 - x_2) + \frac{y_2}{P_2(x_2)}(2x_2 + x_0 - 3x_1)\Bigr]. \tag{13-13}$$
Formula (13-13) reduces to (13-6) when $x_1 - x_0 = x_2 - x_1 = h$. The
formula that corresponds to (13-7) is too long and complicated to be of
practical importance, and hence it is omitted here. It is simpler to apply
(13-13) successively to groups of three values and then add the results.
Example 2. Using the data given in Prob. 3, Sec. 9, find an approximate value of
$\int_{0.16}^{6.25} y\, dx$.
Using (13-12) determines
$$\int_{0.16}^{6.25} y\, dx = \tfrac{1}{2}[0.24(4.210) + 0.6(4.631) + 1.5(5.082) + 3.75(5.590)] = 16.187.$$
Applying (13-13) successively to the first three values and to the last three values gives
$$\int_{0.16}^{6.25} y\, dx = \frac{(0.84)^2}{6} \Bigl[\frac{2(1.2 - 0.32 - 1)}{(-0.24)(-0.84)} + \frac{2.210(-0.84)}{(0.24)(-0.6)} + \frac{2.421(2 + 0.16 - 1.2)}{(0.84)(0.6)}\Bigr]$$
$$+ \frac{(5.25)^2}{6} \Bigl[\frac{2.421(7.5 - 2 - 6.25)}{(-1.5)(-5.25)} + \frac{2.661(-5.25)}{(1.5)(-3.75)} + \frac{2.929(12.5 + 1 - 7.5)}{(5.25)(3.75)}\Bigr].$$
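Formulas (13-12) and (13-13) for unequally spaced abscissas can be sketched as follows. This is an illustration only: the function names are hypothetical, and the abscissas and ordinates used in the demonstration are read off the arithmetic of Example 2, so they should be treated as assumptions about the data of Prob. 3, Sec. 9, which is not reproduced here.

```python
def trapezoidal_unequal(xs, ys):
    """Formula (13-12): sum of trapezoids of varying width."""
    return 0.5 * sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1])
                     for i in range(len(xs) - 1))


def three_point(x0, x1, x2, y0, y1, y2):
    """Formula (13-13): area from x0 to x2 under the parabola through three points."""
    p0 = (x0 - x1) * (x0 - x2)    # P_0(x_0)
    p1 = (x1 - x0) * (x1 - x2)    # P_1(x_1)
    p2 = (x2 - x0) * (x2 - x1)    # P_2(x_2)
    return (x2 - x0) ** 2 / 6 * (
        y0 / p0 * (3 * x1 - 2 * x0 - x2)
        + y1 / p1 * (x0 - x2)
        + y2 / p2 * (2 * x2 + x0 - 3 * x1)
    )


# Values as they appear in the arithmetic of Example 2 (assumed, see above).
xs = [0.16, 0.4, 1.0, 2.5, 6.25]
ys = [2.000, 2.210, 2.421, 2.661, 2.929]
area_trapezoids = trapezoidal_unequal(xs, ys)
area_parabolas = (three_point(xs[0], xs[1], xs[2], ys[0], ys[1], ys[2])
                  + three_point(xs[2], xs[3], xs[4], ys[2], ys[3], ys[4]))
print(round(area_trapezoids, 3), area_parabolas)
```

Since (13-13) integrates the interpolating parabola exactly, `three_point` returns the exact area whenever the three points lie on a quadratic, which gives a simple check on an implementation.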
PROBLEMS
1. Determine the values of $\int_1^7 y\, dx$ by applying (13-5) and (13-7) to the following data:

  x    1        2        3        4        5        6        7
  y    2.157    3.519    4.198    4.539    4.708    4.792    4.835

2. Apply formula (13-12) to compute $\int_{10}^{50.645} p\, dv$ from the data of the Example in Sec. 9.
3. Work the preceding problem by applying (13-13).
4. Apply formulas (13-5) and (13-7) to compute $\int_{19}^{25} H\, dC$ from the data of Prob. 4,
Sec. 9.
5. Find approximate values of $\int_0^6 \sqrt{4 + x^3}\, dx$ by applying formulas (13-5) and (13-7)
with $x_m = m$, $m = 0, 1, 2, \ldots, 6$.
14. Euler's Polygonal Curves. The methods available for the exact
solution of differential equations, as we noted in Chap. 1, apply only to a
few, principally linear, types of differential equations. Many equations
arising in applications are not solvable by such methods, and one is obliged
to devise techniques for the determination of approximate solutions.
We begin with the consideration of the first-order equation
$$y' = f(x,y) \tag{14-1}$$
and seek its solution $y = y(x)$ taking on a prescribed value $y_0 = y(x_0)$ at
$x = x_0$.
At each point of the region where $f(x,y)$ is continuous Eq. (14-1) determines
the slope of the integral curve passing through that point. The
equation of the tangent line at the point $(x_0, y_0)$ to the integral curve
$y = y(x)$ is
$$y - y_0 = f(x_0, y_0)(x - x_0). \tag{14-2}$$
If we advance along this line a short distance to a point $(x_1, y_1)$, we can
compute from (14-1) the value $y'(x_1) = f(x_1, y_1)$ which, in general, will
not be equal to the slope of $y = y(x)$ at $x = x_1$, because the point $(x_1, y_1)$
ordinarily will not lie on the integral curve $y = y(x)$. But if $(x_1, y_1)$ is
close to $(x_0, y_0)$, the slope of the integral curve at $x = x_1$ will not differ
much from $f(x_1, y_1)$. To put it differently, the linear function (14-2)
approximates the solution of (14-1) in the neighborhood of the point
$(x_0, y_0)$. (See Fig. 1 in Chap. 1, Sec. 1.)
We consider next the straight line through $(x_1, y_1)$ with the slope $f(x_1, y_1)$
and proceed along it a small distance to a point $(x_2, y_2)$. At $(x_2, y_2)$ we
draw another straight line with the slope $f(x_2, y_2)$ and advance along it to
a point $(x_3, y_3)$. By continuing this construction we obtain a polygon
consisting of short straight-line segments joining the points $(x_0, y_0)$, $(x_1, y_1)$,
$(x_2, y_2)$, $\ldots$, $(x_n, y_n)$. The polygonal curve so obtained is called Euler’s
polygon. This polygonal curve can be expected to approximate the integral
curve reasonably well when the points $(x_i, y_i)$ are not too far apart
and the end point $(x_n, y_n)$ is not too far away from $(x_0, y_0)$.
The end points of the segments forming Euler’s polygon clearly satisfy
[cf. (14-2)]
$$\begin{aligned} y_1 - y_0 &= f(x_0, y_0)(x_1 - x_0) \\ y_2 - y_1 &= f(x_1, y_1)(x_2 - x_1) \\ &\;\;\vdots \\ y_n - y_{n-1} &= f(x_{n-1}, y_{n-1})(x_n - x_{n-1}) \end{aligned} \tag{14-3}$$
and if each interval $x_i - x_{i-1}$ is of length $h$, we can write (14-3) as
$$y_{m+1} = y_m + f(x_m, y_m)h, \qquad m = 0, 1, 2, \ldots, n - 1. \tag{14-4}$$
The recursion formula (14-4) enables us to compute successively the
approximate values of the ordinates of the integral curve $y = y(x)$ at
$x_k = x_0 + kh$, where $k = 1, 2, \ldots, n$. It may suffice for rough calculations
if the spacing interval $h$ is small and $m$ not too large.
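The recursion (14-4) is easily sketched in code. The example below is illustrative only; the function name `euler_polygon` and the sample equation $y' = y + x$ with $y(0) = 1$ (whose exact solution is $y = 2e^x - x - 1$) are assumptions chosen so that the error can be observed.

```python
import math


def euler_polygon(f, x0, y0, h, n):
    """Vertices (x_m, y_m), m = 0..n, of Euler's polygon from recursion (14-4)."""
    xs, ys = [x0], [y0]
    for m in range(n):
        ys.append(ys[m] + f(xs[m], ys[m]) * h)
        xs.append(x0 + (m + 1) * h)
    return xs, ys


def f(x, y):
    """Right-hand member of the sample equation y' = y + x."""
    return y + x


def exact(x):
    """Exact solution of the sample problem with y(0) = 1."""
    return 2 * math.exp(x) - x - 1


xs, ys = euler_polygon(f, 0.0, 1.0, 0.1, 10)
xs_fine, ys_fine = euler_polygon(f, 0.0, 1.0, 0.05, 20)
error_coarse = abs(ys[-1] - exact(1.0))
error_fine = abs(ys_fine[-1] - exact(1.0))
print(ys[1], error_coarse, error_fine)
```

Halving $h$ roughly halves the error at $x = 1$ in this sample, which is consistent with the remark that (14-4) is adequate only for rough calculations.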
A more accurate formula can be obtained by constructing, instead of
the chain of rectilinear segments, a chain made up of parabolic segments.
Thus, we can draw through $(x_1, y_1)$ a parabola
$$y = a_0 + a_1(x - x_1) + a_2(x - x_1)^2 \tag{14-5}$$
which at $x = x_0$ has the slope $f(x_0, y_0)$ and at $x = x_1$ the slope $f(x_1, y_1)$.
A simple calculation of the constants in (14-5) yields
$$y_2 = y_1 + \{y'(x_1) + \tfrac{1}{2}[y'(x_1) - y'(x_0)]\}h. \tag{14-6}$$
This formula serves to determine $y_2$ if $y_1 = y(x_1)$, $y'(x_1)$, and the difference
$\nabla y_1' \equiv y'(x_1) - y'(x_0)$ are known. Now, if we suppose that the solution
$y(x)$ can be represented by Taylor’s formula
$$y(x) = y_0 + y'(x_0)(x - x_0) + \tfrac{1}{2} y''(x_0)(x - x_0)^2 + \cdots + R_n, \tag{14-7}$$
we can calculate the needed quantities in the right-hand member of (14-6).
The coefficients in (14-7) can be calculated from (14-1) whenever $f(x,y)$
has a sufficient number of partial derivatives, for on setting $(x_0, y_0)$ in
(14-1), we get $y'(x_0) = f(x_0, y_0)$. Differentiating (14-1) with respect to
$x$ yields$^1$
$$y''(x) = f_x(x,y) + f_y(x,y)\, y'(x), \tag{14-8}$$
and substituting $x = x_0$, $y = y_0$ in (14-8) gives
$$y''(x_0) = f_x(x_0, y_0) + f_y(x_0, y_0)\, y'(x_0).$$
By differentiating (14-8), we obtain $y'''(x)$, and so on. The value of $R_n$
in (14-7) in general cannot be computed, but by neglecting it we get an
approximate value of $y(x)$.
$^1$ We use the subscript notation for partial derivatives introduced in Sec. 2, Chap. 3.
Once the coefficients in (14-7) are determined, we use (14-7) to compute
$y(x_1) = y_1$. The value of $y'(x_1)$ is then determined by (14-1), since $y'(x_1)
= f(x_1, y_1)$. The substitution in (14-6) then yields $y_2$.
Having computed $y_2$, we can advance another step and compute $y_3$
from [cf. (14-6)]
$$y_3 = y_2 + \{y'(x_2) + \tfrac{1}{2}[y'(x_2) - y'(x_1)]\}h.$$
This requires calculating $y'(x_2)$ from (14-1).
The general recursion formula, based on the parabolic approximation, is
$$y_{m+1} = y_m + [y'(x_m) + \tfrac{1}{2} \nabla y'(x_m)]h, \tag{14-9}$$
where $\nabla y'(x_m) = y'(x_m) - y'(x_{m-1})$.
More elaborate recursion formulas can be constructed by using polynomials
of higher degree instead of (14-5). Such formulas lie at the basis
of the Adams method of integration of differential equations discussed in
the next section.
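A minimal sketch of the recursion (14-9) follows. It is illustrative only: the function name `parabolic_steps` is hypothetical, the sample equation $y' = y + x$ with $y(0) = 1$ is an assumption chosen because its exact solution $y = 2e^x - x - 1$ is known, and the starting ordinate $y_1$ is taken from that exact solution in place of the Taylor computation described above.

```python
import math


def parabolic_steps(f, x0, y0, y1, h, n):
    """Ordinates y_0..y_n from recursion (14-9); the ordinate y_1 must be supplied."""
    ys = [y0, y1]
    for m in range(1, n):
        slope_now = f(x0 + m * h, ys[m])             # y'(x_m)
        slope_prev = f(x0 + (m - 1) * h, ys[m - 1])  # y'(x_{m-1})
        ys.append(ys[m] + (slope_now + 0.5 * (slope_now - slope_prev)) * h)
    return ys


def exact(x):
    """Exact solution of the sample problem y' = y + x, y(0) = 1."""
    return 2 * math.exp(x) - x - 1


ys = parabolic_steps(lambda x, y: y + x, 0.0, 1.0, exact(0.1), 0.1, 10)
error = abs(ys[10] - exact(1.0))
print(ys[2], ys[10], error)
```

With the same $h = 0.1$, the polygonal recursion (14-4) gives $y_{10} = 2(1.1)^{10} - 2 \approx 3.187$ for this sample, while the exact value is $2e - 2 \approx 3.437$, so the parabolic recursion is noticeably closer.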
PROBLEMS
1. Construct a polygonal approximation, in the interval ( — 1,1), to the solution of
y* «■ %xy which is such that y( 0) * 1. Take the spacing interval h * 0.2. Also obtain
the exact solution, and plot it on the same sheet of paper.
2. Determine the coefficients in (14-5), and thus deduce (14-6).
3. Use the equation in Prob. 1 to illustrate the calculation of $y_2$ from formula (14-6).
Also obtain $y_3$. Take $x_0 = 0$, $y_0 = 1$, $x_1 = 0.2$, $x_2 = 0.4$.
15. The Adams Method. We extend the considerations of the preceding
section by developing a step-by-step procedure for computing an approximate
solution of
$$y' = f(x,y) \tag{15-1}$$
taking on a prescribed value $y_0$ at $x = x_0$. The ordinates $y_m$, approximating
the ordinates of the integral curve $y = y(x)$ at $x = x_m$, will be determined
for equally spaced values of $x$, so that $x_m = x_0 + hm$, where
$m = 0, 1, 2, \ldots$. Thus our approximate solution will appear in a tabulated
form for a discrete set of values of $x$.
By the Fundamental Theorem of Integral Calculus,
$$\int_{x_m}^{x_m + h} y'(x)\, dx = \int_{x_m}^{x_m + h} \Bigl(\frac{dy}{dx}\Bigr) dx = y(x_m + h) - y(x_m),$$
so that
$$y_{m+1} = y_m + \int_{x_m}^{x_m + h} y'(x)\, dx, \tag{15-2}$$
where $y_{m+1} \equiv y(x_m + h)$ and $y_m \equiv y(x_m)$.
Now, if the variable $x$ in the integral of (15-2) is replaced by
$$x = x_m + hX, \tag{15-3}$$
where $X$ is a new dimensionless variable, (15-2) becomes$^1$
$$y_{m+1} = y_m + h \int_0^1 y'(x_m + hX)\, dX. \tag{15-4}$$
But we saw in Sec. 8 that when a function $y'(x)$ is approximated by a
polynomial of degree $n$ taking on the values $y_m', y_{m-1}', \ldots, y_{m-n}'$ at $x = x_m$,
$x_{m-1}, \ldots, x_{m-n}$, then [cf. (8-7)]
$$y'(x_m + hX) = y_m' + X\, \nabla y_m' + \frac{X(X+1)}{2!}\, \nabla^2 y_m' + \cdots + \frac{X(X+1) \cdots (X+n-1)}{n!}\, \nabla^n y_m'. \tag{15-5}$$
If we insert (15-5) in (15-4) and carry out simple integrations, we find that
$$y_{m+1} = y_m + h\bigl(y_m' + \tfrac{1}{2} \nabla y_m' + \tfrac{5}{12} \nabla^2 y_m' + \tfrac{3}{8} \nabla^3 y_m' + \tfrac{251}{720} \nabla^4 y_m' + \cdots + a_n \nabla^n y_m'\bigr), \tag{15-6}$$
where
$$a_n = \int_0^1 \frac{X(X+1) \cdots (X+n-1)}{n!}\, dX. \tag{15-7}$$
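The coefficients $a_n$ of (15-7) can be computed exactly with rational arithmetic. The sketch below is illustrative; the function name `adams_coefficient` is hypothetical. It expands the product $X(X+1)\cdots(X+n-1)$, integrates term by term from 0 to 1, and divides by $n!$.

```python
from fractions import Fraction


def adams_coefficient(n):
    """a_n of (15-7): integral over [0, 1] of X(X+1)...(X+n-1)/n!, as a Fraction."""
    poly = [Fraction(1)]                 # polynomial coefficients, lowest power first
    for k in range(n):
        # Multiply the current polynomial by (X + k).
        new = [Fraction(0)] * (len(poly) + 1)
        for power, c in enumerate(poly):
            new[power] += c * k
            new[power + 1] += c
        poly = new
    factorial = 1
    for k in range(2, n + 1):
        factorial *= k
    # Integrate term by term from 0 to 1: c * X^p contributes c / (p + 1).
    integral = sum(c / (power + 1) for power, c in enumerate(poly))
    return integral / factorial


coefficients = [adams_coefficient(n) for n in range(5)]
print(coefficients)
```

The first five values are $1$, $\tfrac{1}{2}$, $\tfrac{5}{12}$, $\tfrac{3}{8}$, and $\tfrac{251}{720}$, which are the multipliers of $y_m'$, $\nabla y_m'$, $\nabla^2 y_m'$, $\nabla^3 y_m'$, and $\nabla^4 y_m'$ in (15-6).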
Formula (15-6) enables us to compute the ordinate $y_{m+1}$ if we know
$y_m$, $y_m'$ and the backward differences $\nabla^k y_m'$. When the $\nabla y_m'$ vanish, (15-6)
reduces to (14-4), and when the $\nabla^2 y_m'$ vanish, we get (14-9). As was the
case with (14-9), the values of $y_m$, $y_m'$ and the $\nabla^k y_m'$ in (15-6) are not available
to us at the start. They must be computed by some means before
(15-6) can be used to evaluate $y_{m+1}$. The number of the $\nabla^k y_m'$ depends on
the degree $n$ of the polynomial chosen to approximate $y'(x)$. Once we
agree on the value of $n$, we can compute $y_m$, $y_m'$ and the requisite number
of the $\nabla^k y_m'$ with the aid of Taylor’s representation of the solution $y = y(x)$,
as was done in Sec. 14.
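A minimal sketch of the step-by-step use of (15-6), with differences retained through the third order, is given below. It is an illustration under stated assumptions: the function name `adams` is hypothetical, the sample equation is $y' = y + x$ with $y(0) = 1$ and $h = 0.1$, and the four starting ordinates are supplied as rounded Taylor-series values rather than computed inside the function.

```python
import math

# Multipliers of y'_m and its first three backward differences in (15-6).
COEFFS = [1.0, 1 / 2, 5 / 12, 3 / 8]


def adams(f, x0, start, h, n):
    """Extend the starting ordinates y_0..y_3 to y_n by formula (15-6)."""
    ys = list(start)
    for m in range(3, n):
        # Derivatives y'_m, y'_{m-1}, y'_{m-2}, y'_{m-3}, newest first.
        row = [f(x0 + (m - j) * h, ys[m - j]) for j in range(4)]
        diffs = [row[0]]
        for _ in range(3):
            # Each pass forms the next order of backward differences.
            row = [row[i] - row[i + 1] for i in range(len(row) - 1)]
            diffs.append(row[0])
        ys.append(ys[m] + h * sum(c * d for c, d in zip(COEFFS, diffs)))
    return ys


start = [1.0, 1.1103, 1.2428, 1.3997]   # starting ordinates, rounded to four decimals
ys = adams(lambda x, y: y + x, 0.0, start, 0.1, 10)
exact_end = 2 * math.exp(1.0) - 2       # exact y(1) for the sample equation
print(round(ys[4], 4), ys[10], exact_end)
```

The accuracy of the later ordinates is limited both by the neglected fourth differences and by the rounding of the starting values, so the starting values deserve as much care as the recursion itself.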
We illustrate the procedure in detail in the following example.
Example: Use Adams’ method to obtain, in the interval $(0,1)$, an approximate
solution of
$$y' = y + x, \tag{15-8}$$
taking on the value $y_0 = 1$ at $x = 0$.
Let us subdivide the interval $(0,1)$ into subintervals of length $h = 0.1$,
so that
$$x_k = x_0 + kh = 0.1k, \qquad k = 0, 1, 2, \ldots, 10.$$
Furthermore, let us agree to retain in (15-6) the differences of $y_m'$ up to and
including those of order 3. This corresponds to approximating $y'(x)$ in
(15-2) by a polynomial of degree 3.
$^1$ By (15-3) $dx = h\, dX$, and at the limits $x = x_m$ and $x = x_m + h$, the values of $X$
are $X = 0$ and $X = 1$.
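To compute $y_{m+1}$ from (15-6) we need $y_m$, $y'_m$, $\nabla y'_m$, $\nabla^2 y'_m$, and $\nabla^3 y'_m$. The calculation of the third differences $\nabla^3 y'_m$ requires at least four values $y'_m, y'_{m-1}, y'_{m-2}, y'_{m-3}$, as is obvious from the following table.
$$\begin{array}{llll}
y'_{m-3} & & & \\
 & \nabla y'_{m-2} & & \\
y'_{m-2} & & \nabla^2 y'_{m-1} & \\
 & \nabla y'_{m-1} & & \nabla^3 y'_m \\
y'_{m-1} & & \nabla^2 y'_m & \\
 & \nabla y'_m & & \\
y'_m & & &
\end{array}$$
If we determine $y'_0, y'_1, y'_2, y'_3$, we shall be in a position to fill in the values in this table with $m = 3$ and then proceed to determine $y_4$ from (15-6).
Since $y_0 = 1$ for $x_0 = 0$, Eq. (15-8) yields
$$y'_0 = 1. \tag{15-9}$$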
To compute $y_1$, $y_2$, and $y_3$ we use Taylor’s series
$$y(x) = y_0 + y'_0(x - x_0) + \frac{y''_0}{2!}(x - x_0)^2 + \frac{y'''_0}{3!}(x - x_0)^3 + \cdots \tag{15-10}$$
with $y_0 = 1$ and $x_0 = 0$. The coefficients in (15-10) can be calculated from (15-8). Differentiating (15-8), we get
$$y''(x) = y'(x) + 1, \tag{15-11}$$
and on setting $x = 0$ and recalling that $y'(0) = y'_0 = 1$, we get $y''(0) = 2$. Successive differentiations of (15-11) give
$$y'''(x) = y''(x), \quad y^{\mathrm{IV}}(x) = y'''(x), \quad \ldots, \quad y^{(n)}(x) = y^{(n-1)}(x), \tag{15-12}$$
and since $y''(0) = 2$, we get from (15-12)
$$y'''(0) = 2, \quad y^{\mathrm{IV}}(0) = 2, \quad \ldots, \quad y^{(n)}(0) = 2.$$
Accordingly, (15-10) becomes
$$y(x) = 1 + x + x^2 + \frac{x^3}{3} + \frac{x^4}{3\cdot 4} + \frac{x^5}{3\cdot 4\cdot 5} + \cdots.$$
Setting $x = 0.1$, we get
$$y_1 = 1 + 0.1 + (0.1)^2 + \frac{(0.1)^3}{3} + \frac{(0.1)^4}{3\cdot 4} + \frac{(0.1)^5}{3\cdot 4\cdot 5} + \cdots = 1.1103.$$
In the same way using $x = 0.2$ and $x = 0.3$, we obtain
$$y_2 = 1.2428, \qquad y_3 = 1.3997.$$
The desired values of $y'_1$, $y'_2$, and $y'_3$ can now be computed from (15-8). We find that
$$\begin{aligned}
y'_1 &= y_1 + x_1 = 1.1103 + 0.1 = 1.2103,\\
y'_2 &= y_2 + x_2 = 1.2428 + 0.2 = 1.4428,\\
y'_3 &= y_3 + x_3 = 1.3997 + 0.3 = 1.6997.
\end{aligned}$$
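We can now proceed to construct the table of differences shown below. Each backward difference is entered on the row of the point at which it is taken, and the rule below the row $x = 0.3$ separates the starting values from those computed by (15-6).
$$\begin{array}{cccccc}
x & y & y' & \nabla y' & \nabla^2 y' & \nabla^3 y' \\ \hline
0 & 1.0000 & 1.0000 & & & \\
0.1 & 1.1103 & 1.2103 & 0.2103 & & \\
0.2 & 1.2428 & 1.4428 & 0.2325 & 0.0222 & \\
0.3 & 1.3997 & 1.6997 & 0.2569 & 0.0244 & 0.0022 \\ \hline
0.4 & 1.5836 & 1.9836 & 0.2839 & 0.0270 & 0.0026 \\
0.5 & 1.7974 & & & &
\end{array}$$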
The substitution from this table in (15-6), with $m = 3$ and $n = 3$, yields
$$y_4 = 1.3997 + 0.1\left[1.6997 + \tfrac{1}{2}(0.2569) + \tfrac{5}{12}(0.0244) + \tfrac{3}{8}(0.0022)\right] = 1.5836.$$
This value is recorded in the table for $x = 0.4$.
To compute $y_5$ we must extend the table, since formula (15-6) requires the knowledge of $y'_4$ and assorted differences of $y'_4$. By (15-8)
$$y'_4 = y_4 + x_4 = 1.5836 + 0.4 = 1.9836.$$
The calculated values (recorded below the heavy line in the table) can now be used in (15-6), with $m = 4$, $n = 3$, to compute $y_5$. We have
$$y_5 = 1.5836 + 0.1\left[1.9836 + \tfrac{1}{2}(0.2839) + \tfrac{5}{12}(0.0270) + \tfrac{3}{8}(0.0026)\right] = 1.7974.$$
This value is recorded in the table for $x = 0.5$.
We leave it to the reader to make further extensions in the table required for the calculation of $y_6, y_7, \ldots, y_{10}$.
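The whole calculation of this example can be mechanized. The sketch below is ours and is offered only as an illustration under stated assumptions: the names `taylor_start` and `adams_third_differences` are hypothetical, the starting values come from the series obtained from (15-10), and the backward differences through order 3 are formed directly from the last four values of $y'$. Because it carries full floating-point precision rather than four decimals, its later digits may differ slightly from a hand computation.

```python
import math


def taylor_start(x, terms=12):
    """Starting value from the series 1 + x + x^2 + 2*sum_{k>=3} x^k/k!."""
    total = 1.0 + x + x * x
    for k in range(3, terms + 1):
        total += 2.0 * x ** k / math.factorial(k)
    return total


def adams_third_differences(f, xs, start, h):
    """Advance y' = f(x, y) by (15-6), keeping differences through order 3.

    xs    -- equally spaced abscissas x_0, ..., x_N
    start -- the four starting ordinates y_0, ..., y_3
    """
    ys = list(start)
    dys = [f(x, y) for x, y in zip(xs, ys)]  # y'_0, ..., y'_3
    for m in range(3, len(xs) - 1):
        d1 = dys[m] - dys[m - 1]                                   # first difference
        d2 = dys[m] - 2 * dys[m - 1] + dys[m - 2]                  # second difference
        d3 = dys[m] - 3 * dys[m - 1] + 3 * dys[m - 2] - dys[m - 3]  # third difference
        y_next = ys[m] + h * (dys[m] + d1 / 2 + 5 * d2 / 12 + 3 * d3 / 8)
        ys.append(y_next)
        dys.append(f(xs[m + 1], y_next))
    return ys


h = 0.1
xs = [k * h for k in range(11)]
start = [taylor_start(x) for x in xs[:4]]
ys = adams_third_differences(lambda x, y: y + x, xs, start, h)
for x, y in zip(xs, ys):
    # Third column: the exact solution 2e^x - x - 1, for comparison.
    print(f"{x:.1f}  {y:.4f}  {2 * math.exp(x) - x - 1:.4f}")
```

The last column prints the exact solution $y = 2e^x - x - 1$ of (15-8), so the run also serves as a check on the table.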
PROBLEMS
1. Complete the table in the Example of Sec. 15 by computing $y_6, y_7, \ldots, y_{10}$.
2. Since (15-8) is a linear equation, its solution satisfying the condition $y(0) = 1$ is easily found to be $y = 2e^x - x - 1$. Compare the exact and the approximate values $y_1, y_2, \ldots, y_{10}$.
3. Apply the Adams method to obtain an approximate solution of $y' = y$ with $y(0) = 1$. Use $h = 0.1$, and compute $y(0.3)$, $y(0.4)$, $y(0.5)$, and $y(0.6)$ from (15-6) with $n = 2$. Compare with the exact solution.
4. Use $h = 0.1$ and (15-6) with $n = 3$ to find an approximate value of $y(-0.6)$ for the integral curve of $y' = x^2 + y^2$ through $(-1,0)$.
16. Equations of Higher Order. Systems of Equations. The methods of Secs. 14 and 15 can be extended to obtain numerical solutions of equations of higher order. Thus, the second-order equation
$$y'' = f(x,y,y') \tag{16-1}$$
with initial conditions
$$y(x_0) = y_0, \qquad y'(x_0) = y'_0 \tag{16-2}$$
can be written as a system of two equations of first order by setting
$$y' = z. \tag{16-3}$$
The substitution in (16-1) from (16-3) then yields the second equation
$$z' = f(x,y,z). \tag{16-4}$$
In indicating the extension we shall consider, instead of the system (16-3) and (16-4), a more general system
$$y' = f_1(x,y,z), \qquad z' = f_2(x,y,z), \tag{16-5}$$
with initial conditions
$$y(x_0) = y_0, \qquad z(x_0) = z_0. \tag{16-6}$$
When solutions of the system (16-5) can be expanded in Taylor’s series
$$\begin{aligned}
y(x) &= y(x_0) + y'(x_0)(x - x_0) + \frac{y''(x_0)}{2!}(x - x_0)^2 + \cdots,\\
z(x) &= z(x_0) + z'(x_0)(x - x_0) + \frac{z''(x_0)}{2!}(x - x_0)^2 + \cdots,
\end{aligned} \tag{16-7}$$
the coefficients in (16-7) can be computed by differentiating Eqs. (16-5) successively as was done¹ in Secs. 14 and 15.
The construction of Euler’s polygonal approximation also follows the pattern of Sec. 14. Thus, the equation of the straight line through $(x_0,y_0,z_0)$ tangent to the integral curve of the system (16-5) is²
$$\begin{aligned}
y - y_0 &= f_1(x_0,y_0,z_0)(x - x_0),\\
z - z_0 &= f_2(x_0,y_0,z_0)(x - x_0).
\end{aligned} \tag{16-8}$$
1 See in this connection Sec. 6, Chap. 3.
2 The integral curve of the system (16-5) is, in general, a space curve, so that the tangent line to it is determined by the intersection of the planes (16-8).
When abscissas are spaced uniformly $h$ units apart,
$$x_1 = x_0 + h, \quad x_2 = x_0 + 2h, \quad \ldots, \quad x_k = x_0 + kh,$$
and from (16-8) it follows that the approximate solutions at $x_1, x_2, x_3, \ldots$ are
$$\begin{aligned}
y_1 &= y_0 + f_1(x_0,y_0,z_0)h, & z_1 &= z_0 + f_2(x_0,y_0,z_0)h,\\
y_2 &= y_1 + f_1(x_1,y_1,z_1)h, & z_2 &= z_1 + f_2(x_1,y_1,z_1)h,\\
&\;\;\vdots & &\;\;\vdots\\
y_{k+1} &= y_k + f_1(x_k,y_k,z_k)h, & z_{k+1} &= z_k + f_2(x_k,y_k,z_k)h.
\end{aligned}$$
If, instead of approximating the solution in each interval by a linear function, we make use of the polynomial approximations in the manner of Sec. 15, we obtain
$$\begin{aligned}
y_{m+1} &= y_m + h\left[y'_m + \tfrac{1}{2}\nabla y'_m + \tfrac{5}{12}\nabla^2 y'_m + \cdots + a_n\nabla^n y'_m\right],\\
z_{m+1} &= z_m + h\left[z'_m + \tfrac{1}{2}\nabla z'_m + \tfrac{5}{12}\nabla^2 z'_m + \cdots + a_n\nabla^n z'_m\right],
\end{aligned} \tag{16-9}$$
with $a_n$ determined by (15-7).
In computing $y_{m+1}$ and $z_{m+1}$ from (16-9), we must first obtain the values of $y_m$, $z_m$, $y'_m$, $z'_m$ and the required differences, as was done in Sec. 15.
Example: Obtain the solution of the system
$$y' = x + z, \qquad z' = 1 + y \tag{16-10}$$
in the form (16-7), which is such that
$$y(0) = -1, \qquad z(0) = 1. \tag{16-11}$$
On setting $x_0 = 0$ in (16-7) we get
$$\begin{aligned}
y(x) &= y(0) + y'(0)x + \frac{1}{2!}y''(0)x^2 + \cdots,\\
z(x) &= z(0) + z'(0)x + \frac{1}{2!}z''(0)x^2 + \cdots,
\end{aligned} \tag{16-12}$$
the coefficients in which can be computed by differentiating (16-10) and noting (16-11). We obtain from (16-10)
$$\begin{aligned}
y''(x) &= 1 + z'(x), & z''(x) &= y'(x),\\
y'''(x) &= z''(x), & z'''(x) &= y''(x),\\
&\;\;\vdots & &\;\;\vdots\\
y^{(n)}(x) &= z^{(n-1)}(x), & z^{(n)}(x) &= y^{(n-1)}(x).
\end{aligned} \tag{16-13}$$
The substitution from (16-11) in (16-10) yields $y'(0) = 1$, $z'(0) = 1 - 1 = 0$, and making use of these values in (16-13) we find
$$\begin{aligned}
y''(0) &= 1 + 0 = 1, & z''(0) &= y'(0) = 1,\\
y'''(0) &= z''(0) = 1, & z'''(0) &= y''(0) = 1,\\
&\;\;\vdots & &\;\;\vdots\\
y^{(n)}(0) &= 1, & z^{(n)}(0) &= 1.
\end{aligned}$$
Accordingly, (16-12) yields
$$\begin{aligned}
y(x) &= -1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,\\
z(x) &= 1 + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots.
\end{aligned} \tag{16-14}$$
By eliminating $z$ from the system (16-10) we see that it is equivalent to the second-order equation
$$y'' - y = 2$$
with $y(0) = -1$, $y'(0) = 1$. Its solution is readily found to be
$$y = e^x - 2, \tag{16-15}$$
and from the first of Eqs. (16-10) we conclude that
$$z = e^x - x. \tag{16-16}$$
The Maclaurin expansions of these solutions are precisely (16-14).
It may be instructive to compute the polygonal approximations to the solution of (16-10) at $x = 0.2$ and $x = 0.4$. On setting the differences in (16-9) equal to zero, we get
$$y_{m+1} = y_m + hy'_m, \qquad z_{m+1} = z_m + hz'_m. \tag{16-17}$$
Now, if we take $x_1 = 0.2$, so that $h = 0.2$, we obtain from (16-17)
$$\begin{aligned}
y(0.2) &\approx y_1 = -1 + (0.2)(1) = -0.8,\\
z(0.2) &\approx z_1 = 1 + (0.2)(0) = 1,
\end{aligned}$$
since $y'_0 = 1$ and $z'_0 = 0$.
The exact solution (16-15) and (16-16) yields
$$\begin{aligned}
y(0.2) &= e^{0.2} - 2 = -0.7786,\\
z(0.2) &= e^{0.2} - 0.2 = 1.0214.
\end{aligned}$$
Using $y_1 = -0.8$, $z_1 = 1$ in (16-17), we obtain
$$y(0.4) \approx y_2 = y_1 + 0.2y'_1, \qquad z(0.4) \approx z_2 = z_1 + 0.2z'_1. \tag{16-18}$$
The values of $y'_1$ and $z'_1$ can be calculated from (16-10) by setting $x = 0.2$, $z = 1$, and $y = -0.8$. We find that
$$y'_1 = 0.2 + 1 = 1.2, \qquad z'_1 = 1 + (-0.8) = 0.2,$$
and then (16-18) yields
$$\begin{aligned}
y_2 &= -0.8 + (0.2)(1.2) = -0.56,\\
z_2 &= 1 + (0.2)(0.2) = 1.04,
\end{aligned}$$
while the corresponding exact values are
$$\begin{aligned}
y(0.4) &= e^{0.4} - 2 = -0.5082,\\
z(0.4) &= e^{0.4} - 0.4 = 1.0918.
\end{aligned}$$
The reader is advised to obtain more accurate polygonal approximations by taking the interval $h = 0.1$ and to compare the polygonal approximations with the values given by (16-9) in which the differences of order higher than 1 are set equal to zero.
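The polygonal approximation (16-17) is easily carried out by machine for any step $h$. The following is a minimal sketch under our own naming (`euler_system` is not a name used in the text); it reproduces the two steps of the example and can be rerun with $h = 0.1$ as suggested above.

```python
import math


def euler_system(f1, f2, x0, y0, z0, h, steps):
    """Polygonal approximation for y' = f1(x,y,z), z' = f2(x,y,z).

    Returns the list of points (x_k, y_k, z_k), k = 0, ..., steps.
    """
    x, y, z = x0, y0, z0
    points = [(x, y, z)]
    for _ in range(steps):
        # Both right-hand sides are evaluated at the old point (x_k, y_k, z_k).
        y, z = y + h * f1(x, y, z), z + h * f2(x, y, z)
        x += h
        points.append((x, y, z))
    return points


# The system (16-10) with the initial values (16-11).
points = euler_system(lambda x, y, z: x + z,
                      lambda x, y, z: 1 + y,
                      0.0, -1.0, 1.0, h=0.2, steps=2)
for x, y, z in points:
    # Polygonal values followed by the exact values from (16-15) and (16-16).
    print(f"{x:.1f}  {y:.4f}  {z:.4f}  {math.exp(x) - 2:.4f}  {math.exp(x) - x:.4f}")
```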
PROBLEMS
1. Obtain from (16-7) a fourth-degree polynomial approximation to the solution of
y f « e x -f 2, d ** 4 y
with $y(0) = 0$ and $z(0) = 0$.
2. Use a polygonal approximation to compute $y_1, y_2, y_3, y_4$ for the system in Prob. 1 by taking $x_1 = 0.1$, $x_2 = 0.2$, $x_3 = 0.3$, $x_4 = 0.4$.
3. Use a polygonal approximation to compute $y_1, y_2, y_3$ corresponding to $x_1 = 0.1$, $x_2 = 0.2$, $x_3 = 0.3$ for $y'' - y^2 = x$, with initial conditions $y(0) = 1$, $y'(0) = 0$. Hint: Set $y' = z$, and consider the system $y' = z$, $z' = x + y^2$ with $y(0) = 1$ and $z(0) = 0$.
4. Obtain the solution for Prob. 3 in Maclaurin’s series.
5. Solve the system in Prob. 3 by the Adams method. Retain only the second differences in (16-9), and use the result of Prob. 4 to start the iteration.
17. Boundary-value Problems. In many physical problems solutions
of the second- and higher-order differential equations are required which
satisfy preassigned conditions at more than one point of the interval. A
simple example of this occurs in the study of deflections of a beam supported
at several points. Problems of this sort are termed boundary-value prob-
lems to distinguish them from initial-value problems in which the conditions
on solutions are imposed only at one point.
An important feature of the boundary-value problems is that their solutions (if they exist at all) need not be unique.¹ When the general solution of the differential equation can be obtained, the conditions imposed on solutions of the boundary-value problem can usually be met by determining the values of arbitrary constants in the general solution² so that the specified conditions are satisfied. However, general solutions of differential equations can rarely be written down, and one is obliged to seek solutions of boundary-value problems by numerical methods. The methods available for numerical solution of initial-value problems require that the integral curve be uniquely determined at the starting point and thus do not apply to problems in which solutions must satisfy specified conditions at more than one point. To solve a boundary-value problem numerically one must employ laborious trial-and-error procedures utilizing the solutions of suitable initial-value problems.
1 See, for example, our discussion of two interesting two-point boundary-value problems in Sec. 34, Chap. 1.
2 This was the procedure followed in solving the boundary-value problems in Sec. 34, Chap. 1.
We outline briefly the procedure commonly followed in solving a two-point boundary-value problem for the second-order differential equation.¹ Let it be required to determine a solution of
$$y'' = f(x,y,y') \tag{17-1}$$
which assumes at the end points of the interval $a < x < b$ the values
$$y(a) = A, \qquad y(b) = B. \tag{17-2}$$
Now, if in addition to the value $y(a) = A$ we specify the slope $y'(a)$ at $x = a$, the solution of (17-1) is uniquely determined,² but this solution will satisfy the condition $y(b) = B$ only for some value of the slope $y'(a)$ which is not known.³ Physical or geometric considerations may suggest an approximate value of the slope, say $y'(a) = C$, which is such that the integral curve of (17-1) satisfying the conditions
$$y(a) = A, \qquad y'(a) = C \tag{17-3}$$
also satisfies the condition $y(b) = B$.
The procedure used in solving the boundary-value problem consists in actually constructing the solution $y = y(x)$ satisfying the conditions (17-3) and computing the value of $y(x)$ at $x = b$. If it is tolerably near $B$, we have the desired approximate solution of the boundary-value problem. If not, we choose another value of the slope $y'(a)$ and try again. The procedure is clearly laborious and far from being elegant.
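The trial-and-error procedure just described can be organized systematically. The sketch below is an illustration of ours and makes several assumptions not found in the text: each trial solution is built by the polygonal method of Sec. 16 applied to the system $y' = z$, $z' = f(x,y,z)$, the successive trial slopes are chosen by repeated halving of an interval whose end slopes overshoot and undershoot $B$, and the names `integrate` and `shoot` are hypothetical.

```python
import math


def integrate(f, a, b, A, C, n=1000):
    """Polygonal solution of y'' = f(x, y, y') with y(a) = A, y'(a) = C.

    Returns the computed value of y at x = b.
    """
    h = (b - a) / n
    x, y, z = a, A, C  # z stands for y'
    for _ in range(n):
        y, z = y + h * z, z + h * f(x, y, z)
        x += h
    return y


def shoot(f, a, b, A, B, lo, hi, tol=1e-8, n=1000):
    """Find a slope C in (lo, hi) for which the trial solution reaches B at x = b."""
    def miss(C):
        return integrate(f, a, b, A, C, n) - B

    m_lo, m_hi = miss(lo), miss(hi)
    if m_lo * m_hi > 0:
        raise ValueError("the two trial slopes do not bracket the boundary value")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        m_mid = miss(mid)
        if m_lo * m_mid <= 0:
            hi, m_hi = mid, m_mid
        else:
            lo, m_lo = mid, m_mid
    return 0.5 * (lo + hi)


# Illustration: y'' = -y, y(0) = 0, y(pi/2) = 1, whose exact solution is sin x.
rhs = lambda x, y, z: -y
C = shoot(rhs, 0.0, math.pi / 2, 0.0, 1.0, lo=0.0, hi=2.0)
print(C)  # close to the exact slope 1, up to the error of the polygonal method
```

The slope found in this way is only as good as the initial-value method used for the trials; a more accurate method, such as that of Sec. 15, could be substituted in `integrate`.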
18. Characteristic-value Problems. Closely associated with boundary-value problems are characteristic-value problems. These are generally concerned with solutions of the two-point boundary-value problems for differential equations containing parameters.
A simple instance of the characteristic-value problem occurs in the study of small vibrations of an elastic string of finite length.⁴ When initial shape and initial velocity of the string are specified, its subsequent displacement $u(x,t)$ is determined by solving the equation
$$\frac{\partial^2 u}{\partial t^2} = a^2\frac{\partial^2 u}{\partial x^2}, \tag{18-1}$$
where $a$ is a physical constant. If the string is of length $l$ and its ends are fixed at $x = 0$ and $x = l$, the solution of (18-1) must satisfy the end conditions
$$u(0,t) = 0, \qquad u(l,t) = 0. \tag{18-2}$$
1 For a more detailed discussion of such problems see W. E. Milne, “Numerical Solutions of Differential Equations,” chap. 7, John Wiley & Sons, Inc., New York, 1952.
2 We suppose that $f(x,y,y')$ is such that the initial-value problem has a unique solution.
3 We assume that the boundary-value problem in (17-1) and (17-2) indeed has a solution.
4 See Chap. 6.
When we attempt to obtain solutions of (18-1) by the method of separation of variables,¹ that is, by assuming that $u(x,t)$ is expressible in the form
$$u = y(x)T(t), \tag{18-3}$$
where $y(x)$ is a function of $x$ alone and $T(t)$ is a function of $t$ alone, we are led to a pair of ordinary differential equations
$$\frac{d^2y}{dx^2} + \lambda^2 y = 0, \tag{18-4}$$
$$\frac{d^2T}{dt^2} + a^2\lambda^2 T = 0,$$
where $\lambda$ is a constant. This constant must be chosen so that the end conditions (18-2) are satisfied.
From the assumed form of solution (18-3) and from (18-2) it follows that the solutions of (18-4) must be such that
$$y(0) = 0, \qquad y(l) = 0. \tag{18-5}$$
We thus have a two-point boundary-value problem for Eq. (18-4) with the end conditions (18-5).
The determination of suitable solutions this time is very simple because the general solution of (18-4) is
$$y = c_1\cos\lambda x + c_2\sin\lambda x. \tag{18-6}$$
If we impose the conditions (18-5) on (18-6) and reject the trivial solution $y = 0$, we find infinitely many solutions
$$y = c_2\sin\lambda x, \tag{18-7}$$
where
$$\lambda = \frac{k\pi}{l}, \qquad k = 1, 2, \ldots. \tag{18-8}$$
The values of $\lambda$ in (18-8) are called the characteristic values of the boundary-value problem of (18-4) and (18-5), and the solutions (18-7) with appropriate $\lambda$’s are characteristic functions of this problem.*
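The step from (18-6) to (18-7) and (18-8) may be written out as follows. This short derivation is supplied for convenience and uses only the conditions (18-5); only positive values of $k$ are listed, since $k = 0$ gives the trivial solution and negative values merely change the sign of $y$.

```latex
\begin{align*}
y(0) &= c_1\cos 0 + c_2\sin 0 = c_1 = 0
      &&\Longrightarrow\quad y = c_2\sin\lambda x,\\
y(l) &= c_2\sin\lambda l = 0,\quad c_2 \neq 0
      &&\Longrightarrow\quad \sin\lambda l = 0,\\
\lambda l &= k\pi
      &&\Longrightarrow\quad \lambda = \frac{k\pi}{l},\qquad k = 1, 2, \ldots
\end{align*}
```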
1 See Sec. 10, Chap. 6.
* The terms eigenvalue and eigenfunction are used by some writers to mean “characteristic value” and “characteristic function,” respectively. These stem from German words Eigenwert and Eigenfunktion. We eschew the hybrids, since this book is written in English.
The simplicity of the characteristic-value problem defined by (18-4) and (18-5) masks some important features of the general problem. These features become clearer if we consider the determination of small vibrations of an elastic rod of variable cross section. In this case the separation of variables in the appropriate partial differential equation leads to the equation
$$\frac{d^2}{dx^2}\left[p(x)\frac{d^2y}{dx^2}\right] - \lambda q(x)\,y = 0, \tag{18-9}$$
in which $p(x)$ and $q(x)$ are known functions and $\lambda$ an unknown constant.
If the rod is of length $l$, with the end points at $x = 0$ and $x = l$, the solutions of (18-9) must satisfy suitable conditions determined by the mode of fixing the ends. If the end $x = 0$ is clamped, then $y(0) = y'(0) = 0$; if it is simply supported, then $y(0) = y''(0) = 0$; if it is free, then $y''(0) = y'''(0) = 0$. Similar conditions are imposed at the end $x = l$.
For definiteness, we suppose that the ends of the rod are free (a ship floating at sea). We then seek a solution of (18-9) such that
$$y''(0) = y'''(0) = 0, \qquad y''(l) = y'''(l) = 0. \tag{18-10}$$
Since (18-9) is a linear equation, its general solution is the sum of four linearly independent solutions
$$y(x,\lambda) = c_1y_1(x,\lambda) + c_2y_2(x,\lambda) + c_3y_3(x,\lambda) + c_4y_4(x,\lambda), \tag{18-11}$$
where $\lambda$ is the parameter appearing in (18-9) and the $c_i$ are arbitrary constants. On imposing the end conditions (18-10) on (18-11) we get a system of four equations:
$$\begin{aligned}
c_1y_1''(0,\lambda) + c_2y_2''(0,\lambda) + c_3y_3''(0,\lambda) + c_4y_4''(0,\lambda) &= 0,\\
c_1y_1'''(0,\lambda) + c_2y_2'''(0,\lambda) + c_3y_3'''(0,\lambda) + c_4y_4'''(0,\lambda) &= 0,\\
c_1y_1''(l,\lambda) + c_2y_2''(l,\lambda) + c_3y_3''(l,\lambda) + c_4y_4''(l,\lambda) &= 0,\\
c_1y_1'''(l,\lambda) + c_2y_2'''(l,\lambda) + c_3y_3'''(l,\lambda) + c_4y_4'''(l,\lambda) &= 0.
\end{aligned}$$
This system of four linear equations in the unknowns $c_i$ will have a nontrivial solution if, and only if, the determinant $D(\lambda)$ of the coefficients of the $c$’s is zero.¹ The equation
$$D(\lambda) = 0 \tag{18-12}$$
is the characteristic equation, and its solutions are the characteristic values of the problem. In general (18-12) is a transcendental equation, and its solution poses many vexing problems.² Usually it is solved by numerical methods.
1 See Appendix A.
2 An instance of a simple transcendental characteristic equation appears in Sec. 10, Chap. 6, Eq. (10-16), in which the parameter is denoted by 0. See also Sec. 36, Chap. 1, Eq. (36-4), where $D(\lambda) = 0$ is an algebraic equation.
Because of the importance of characteristic equations in analyzing the behavior of dynamical systems, they have been studied extensively and there is a vast literature on the subject of numerical determination of characteristic values.¹
19. Method of Finite Differences. We conclude this chapter with a
brief description of the most commonly used method for solving boundary-
value problems in partial differential equations, known as the method of
finite differences. In this method the differential equation is replaced by
an approximating difference equation, and the continuous region in which
the solution is desired by a set of discrete points. This permits one to
reduce the problem to the solution of a system of algebraic equations,
which may involve hundreds of unknowns. Ordinarily, some iterative
technique has to be devised to solve such systems, and high-speed elec-
tronic computers have been developed largely because of the need for
coping with problems of this sort.
The main disadvantage of all numerical techniques is that they give
numerical values for unknown functions at a set of discrete points instead
of the analytic expressions defined over the initial region R. Of course,
when the boundary-value data are determined by measurements at a
finite set of points of R, the difference-equations methods may be the best
mode of attack on the problem. Any analytic technique would require
fitting curves to the discontinuous data.
We proceed to the outline of the general procedure followed in reducing the given analytic boundary-value problem to a problem in difference equations. For definiteness let the region $R$ be bounded by a simple closed curve $C$. We seek to determine the function $u(x,y)$ satisfying a given differential equation in $R$. From the definition of partial derivatives it follows that²
$$\frac{\partial u}{\partial x} = \lim_{h\to 0}\frac{u(x+h,y) - u(x,y)}{h}.$$
Also, if the second partial derivatives are continuous one can show that
$$\frac{\partial^2 u}{\partial x^2} = \lim_{h\to 0}\frac{u(x+h,y) - 2u(x,y) + u(x-h,y)}{h^2},$$
$$\frac{\partial^2 u}{\partial x\,\partial y} = \lim_{\substack{h\to 0\\ k\to 0}}\frac{u(x+h,y+k) - u(x+h,y) - u(x,y+k) + u(x,y)}{hk},$$
and so on.
For small values of $h$ and $k$ the partial derivatives are nearly equal to the difference quotients appearing in the right-hand members of these formulas. If one replaces derivatives in the given differential equation by difference quotients, there results a difference equation which is a good approximation to the given equation when $h$ and $k$ are small.
1 For bibliography see Milne, op. cit., and F. B. Hildebrand, “Introduction to Numerical Analysis,” McGraw-Hill Book Company, Inc., New York, 1956.
2 See Chap. 3, and Chap. 6, Sec. 21.
Thus, to Laplace’s equation
$$\nabla^2 u \equiv \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0,$$
there corresponds the difference equation
$$\Delta_{xx}u + \Delta_{yy}u = 0,$$
where
$$\Delta_{xx}u \equiv \frac{1}{h^2}\,[u(x+h,y) - 2u(x,y) + u(x-h,y)],$$
$$\Delta_{yy}u \equiv \frac{1}{h^2}\,[u(x,y+h) - 2u(x,y) + u(x,y-h)].$$
In a difference equation the values of $u(x,y)$ are related at a set of discrete points determined by the choices of $h$ and $k$. Ordinarily these points are chosen so that they form a square net¹ with specified mesh size $h$.
The usual procedure is to cover the region $R$ by a net consisting of two sets of mutually orthogonal lines a distance $h$ apart (Fig. 14) and mark off a polygonal contour $C'$ so that it approximates sufficiently closely the boundary $C$. The domain $R'$ in which the solution of the difference equation is sought is formed by the lattice points of the net contained within $C'$. The assigned boundary values on $C$ are then transferred in some manner to the lattice points on $C'$. When the lattice points on $C'$ do not coincide with points on $C$, the desired values can be got by interpolation.²
1 Rectangular, polygonal, and curvilinear nets are also used. See, for example, D. Y. Panov, “Handbook on Numerical Solution of Partial Differential Equations,” Moscow, 1951, which contains a good account of the difference-equations techniques. See also Appendix to S. Timoshenko and J. N. Goodier’s “Theory of Elasticity,” 1951.
2 See, for example, Milne, op. cit., or L. M. Milne-Thomson, “Calculus of Finite Differences,” 1933.
One then seeks a solution of the difference equation which satisfies the boundary conditions imposed at the lattice points on $C'$. Usually, this leads to a consideration of a system of a large number of algebraic equations in many unknowns.¹
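To make the reduction to algebraic equations concrete, consider the simplest case, in which $R$ is the unit square and the lattice points on $C'$ lie on $C$, so that no interpolation is needed. With a square net the difference equation for Laplace’s equation states that each interior value is the mean of its four neighbors. The sketch below is ours: the function name `solve_laplace_square` is hypothetical, and the particular iterative technique (repeated sweeps in which each interior value is replaced by that mean, using the newest neighbor values) is one common choice, assumed here for illustration.

```python
def solve_laplace_square(boundary, n, sweeps=500):
    """Solve the difference equation for Laplace's equation on the unit square.

    boundary -- function giving the assigned values u(x, y) on the boundary
    n        -- number of mesh intervals per side, so that h = 1/n
    Returns the (n+1)-by-(n+1) array of lattice values u[i][j] ~ u(i*h, j*h).
    """
    h = 1.0 / n
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    # Transfer the boundary values to the lattice points on the boundary.
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                u[i][j] = boundary(i * h, j * h)
    # Iterate: each interior value becomes the mean of its four neighbors.
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                u[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j]
                                  + u[i][j + 1] + u[i][j - 1])
    return u


# Check on a known harmonic function, u = x^2 - y^2.
n = 8
u = solve_laplace_square(lambda x, y: x * x - y * y, n)
print(u[4][2])  # lattice value at the point (0.5, 0.25)
```

The function $x^2 - y^2$ is chosen for the check because the second difference quotients of a quadratic coincide with its second derivatives, so the difference equation is satisfied by it exactly and the iteration should reproduce it at every lattice point.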
1 Further discussion of difference equations is given in Chap. 6, Secs. 26 and 27, and in Chap. 8, Sec. 10. See also chap. 10 by T. J. Higgins in L. E. Grinter (ed.), “Numerical Methods of Analysis in Engineering,” 1949.
The literature on finite-difference methods is extensive. An illustration of the use of the method of finite differences in solving a boundary-value problem in Laplace’s equation is included in I. S. Sokolnikoff, “Mathematical Theory of Elasticity,” sec. 124, McGraw-Hill Book Company, Inc., New York, 1956, which contains further references.
APPENDIX
Appendix A. Determinants
1. The Definition and Properties of Determinants
2. Cramer’s Rule
Appendix B. The Laplace Transform
1. Definition of the Laplace Transform
2. Some Uses of the Laplace Transform
3. Discontinuities. The Dirac Distribution
4. Additional Properties of the Transform
5. Steady-state Solutions
6. Integral Equations
Appendix C. Comparison of the Riemann and Lebesgue Integrals
1. The Riemann Integral
2. Measure
3. The Lebesgue Integral
Appendix D. Table of 4>(x) = J e~ l 12 dt .
APPENDIX A
DETERMINANTS
1. The Definition and Properties of Determinants. A determinant of
the first order consists of a single element a and has the value a. A de-
terminant of the second order contains four elements in a 2-by-2 square
array and has the value
$$\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} = a_1b_2 - a_2b_1. \tag{1-1}$$
A determinant of third order is similarly defined, in terms of second-order determinants:
$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= a_1\begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix}
- a_2\begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
+ a_3\begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}. \tag{1-2}$$
By analogy, a determinant of order $n$ consists of a square $n$-by-$n$ array of elements $a_{ij}$:
$$\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}$$
to which a numerical value is assigned as follows: Denoting the determinant by $D$, let the elements in the first row be $a_{1i}$, and let $M_{1i}$ be the determinant of order $n - 1$ formed when the first row and $i$th column of $D$ are deleted. Then, by definition,
$$D = a_{11}M_{11} - a_{12}M_{12} + \cdots + (-1)^{1+n}a_{1n}M_{1n}. \tag{1-3}$$
The definition is inductive; a determinant of order $n$ is defined in terms of those having order $n - 1$.
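Because the definition (1-3) is inductive, it translates directly into a recursive computation. The following sketch is offered only as an illustration; the function name `det` and the representation of the array as a list of rows are our assumptions. It is practical for small orders only, since the number of operations grows like $n!$.

```python
def det(rows):
    """Determinant by the inductive definition (1-3).

    rows -- the square array as a list of rows, each a list of numbers
    """
    n = len(rows)
    if n == 1:
        # A first-order determinant has the value of its single element.
        return rows[0][0]
    total = 0
    for i in range(n):
        # Minor of the (i+1)th element of the first row:
        # delete the first row and the (i+1)th column.
        minor = [row[:i] + row[i + 1:] for row in rows[1:]]
        total += (-1) ** i * rows[0][i] * det(minor)
    return total


print(det([[1, 2], [3, 4]]))                    # 1*4 - 2*3 = -2, as in (1-1)
print(det([[1, -1, 2], [6, 4, 3], [2, 2, 3]]))  # 26
```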
The expansion (1-3) is termed a Laplace development of the determinant on elements of the first row. The determinant $M_{1i}$ is called the minor of the element $a_{1i}$; the signed determinant $(-1)^{1+i}M_{1i}$ is the cofactor of $a_{1i}$. More generally, the determinant $M_{ij}$ formed when the $i$th row and $j$th column are deleted is the minor of the element $a_{ij}$ in this row and column. The signed determinant
$$A_{ij} = (-1)^{i+j}M_{ij} \tag{1-4}$$
is the cofactor of $a_{ij}$. It is a fundamental theorem that a determinant may be evaluated by a Laplace development on any row or column; in other words,
$$\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
= \sum_{i=1}^{n} a_{ij}A_{ij} = \sum_{j=1}^{n} a_{ij}A_{ij}. \tag{1-5}$$
The proof may be given by induction directly or may be based on the following considerations, which are also established by induction. The expansion of an $n$th-order determinant is a sum of the $n!$ terms $(-1)^k a_{k_1 1}a_{k_2 2}\cdots a_{k_n n}$, where $k_1, k_2, \ldots, k_n$ are the numbers $1, 2, \ldots, n$ in some order. The integer $k$ is defined as the number of inversions of order of the subscripts $k_1, k_2, \ldots, k_n$ from the normal order $1, 2, \ldots, n$, where a particular arrangement is said to have $k$ inversions of order if it is necessary to make $k$ successive interchanges of adjacent elements in order to make the arrangement assume the normal order. There are $n!$ terms, since there are $n!$ permutations of the $n$ first subscripts, and each term contains as a factor one, and only one, element from each row and one, and only one, element from each column.
For example, consider the third-order determinant
$$D = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}.$$
The six terms of the expansion are, apart from sign,
$$a_{11}a_{22}a_{33}, \quad a_{11}a_{32}a_{23}, \quad a_{21}a_{12}a_{33}, \quad a_{21}a_{32}a_{13}, \quad a_{31}a_{12}a_{23}, \quad a_{31}a_{22}a_{13}.$$
The first term, in which the first subscripts have the normal order, is called the diagonal term, and its sign is positive. In the second term the arrangement 132 requires the interchange of 2 and 3 to make it assume the normal order; therefore $k = 1$, and the term has a negative sign. Similarly, the third term has a negative sign. The fourth term has a positive sign, for the arrangement 231 requires the interchange of 3 and 1 followed by the interchange of 2 and 1 to assume the normal order. Similarly, the fifth term has a positive sign. In the sixth term, it is necessary to make three interchanges (3 and 2, 3 and 1, and 2 and 1) in order to arrive at the normal order; hence, this term will have a negative sign. It follows that
$$D = a_{11}a_{22}a_{33} - a_{11}a_{32}a_{23} - a_{21}a_{12}a_{33} + a_{21}a_{32}a_{13} + a_{31}a_{12}a_{23} - a_{31}a_{22}a_{13}.$$
The main result of this discussion is that a determinant is the sum of all the $n!$ products which can be formed by taking exactly one element from each row and each column and multiplying by 1 or $-1$ according to a definite rule.
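The rule of signs can be tried out mechanically. In the sketch below, which is ours and not part of the text, the names `inversions` and `det_by_permutations` are hypothetical, rows and columns are numbered from 0 as is customary in the language, and $k$ is found by counting the pairs of first subscripts that stand out of their normal order, on the assumption (a standard fact) that this count equals the number of adjacent interchanges needed.

```python
from itertools import permutations


def inversions(arrangement):
    """Number of pairs standing out of their normal order."""
    n = len(arrangement)
    return sum(1
               for i in range(n)
               for j in range(i + 1, n)
               if arrangement[i] > arrangement[j])


def det_by_permutations(a):
    """Sum of the n! signed products, one element from each row and column."""
    n = len(a)
    total = 0
    for first_subscripts in permutations(range(n)):
        term = 1
        for column in range(n):
            # Element whose first subscript is k_column and second is column.
            term *= a[first_subscripts[column]][column]
        total += (-1) ** inversions(first_subscripts) * term
    return total


print(inversions((1, 2, 0)))  # the arrangement 231 of the text: 2 interchanges
print(det_by_permutations([[1, -1, 2], [6, 4, 3], [2, 2, 3]]))  # 26
```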
Example 1. By a development on the second column, evaluate
10—12
I a
D
743
6 0
4 7
2 0
The $(-1)^{i+j}$ rule for determining sign means that the sign of the minors alternates as we proceed from one element to an adjacent one in the same row or column, and the sign starts with $+$ in the upper left-hand corner. Thus,
$$D = 0(-M_{12}) + 0(M_{22}) + 7(-M_{32}) + 0(M_{42}).$$
Crossing out the row and column containing 7 gives the determinant $M_{32}$, whence
$$\begin{aligned}
D = -7M_{32} &= (-7)\begin{vmatrix} 1 & -1 & 2 \\ 6 & 4 & 3 \\ 2 & 2 & 3 \end{vmatrix}\\
&= (-7)\left[(1)\begin{vmatrix} 4 & 3 \\ 2 & 3 \end{vmatrix} - (-1)\begin{vmatrix} 6 & 3 \\ 2 & 3 \end{vmatrix} + (2)\begin{vmatrix} 6 & 4 \\ 2 & 2 \end{vmatrix}\right]\\
&= (-7)[(1)(6) - (-1)(12) + 2(4)] = -182.
\end{aligned}$$
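Example 2. The following determinant is said to be in diagonal form. Show that its value is $abcd$ no matter what elements are put in place of the $*$’s:
$$D = \begin{vmatrix} a & * & * & * \\ 0 & b & * & * \\ 0 & 0 & c & * \\ 0 & 0 & 0 & d \end{vmatrix}.$$
Successive Laplace developments on first columns give
$$D = a\begin{vmatrix} b & * & * \\ 0 & c & * \\ 0 & 0 & d \end{vmatrix} = ab\begin{vmatrix} c & * \\ 0 & d \end{vmatrix} = abcd.$$
Evidently, a similar result is true in general.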
Example 3. If the elements are differentiable functions of $t$, show that the derivative of the determinant (1-2) is
$$\begin{vmatrix} a_1' & a_2' & a_3' \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1' & b_2' & b_3' \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1' & c_2' & c_3' \end{vmatrix}.$$
A typical term in the expansion is $\pm a_ib_jc_k$. Differentiating gives
$$\pm(a_ib_jc_k)' = \pm a_i'b_jc_k \pm a_ib_j'c_k \pm a_ib_jc_k',$$
and the sum on $i$, $j$, $k$ of these three types of terms yields the expanded form of the three determinants. A corresponding result for determinants of order $n$ is proved in the same way.
The fundamental theorem (1-5) leads to some important properties of determinants that are now enumerated.
1. If each element in a row or column of a determinant is zero, the determinant is zero.
2. If each element in a row or column is multiplied by $m$, the determinant is multiplied by $m$.
3. If each element of a row or column is a sum of two terms, the determinant equals the sum of the two corresponding determinants; for example,
$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a+\alpha & b+\beta & c+\gamma \\ a_2 & b_2 & c_2 \end{vmatrix}
= \begin{vmatrix} a_1 & b_1 & c_1 \\ a & b & c \\ a_2 & b_2 & c_2 \end{vmatrix}
+ \begin{vmatrix} a_1 & b_1 & c_1 \\ \alpha & \beta & \gamma \\ a_2 & b_2 & c_2 \end{vmatrix}. \tag{1-6}$$
These three results become obvious when we make a Laplace development on the row or column in question. In (1-6), for example, let $A$, $B$, $C$ be the cofactors of the elements in the second row. The determinant is
$$(a + \alpha)A + (b + \beta)B + (c + \gamma)C,$$
which equals $(aA + bB + cC) + (\alpha A + \beta B + \gamma C)$. This, in turn, is the sum of the expansions of the two determinants on the right of (1-6). The proof for $n$-by-$n$ determinants is very similar and should be supplied by the reader.
4. If two rows or two columns are proportional, the determinant is zero.
5. If two rows or two columns are interchanged, the determinant changes sign.
6. If rows and columns are interchanged, the determinant is unaltered.
The properties 4, 5, and 6 are easily verified for 2-by-2 determinants, then proved in general by mathematical induction. To obtain item 6, for example, expand the original determinant on elements of the first row and the new one on elements of the first column. The theorem for order $n$ then follows from the theorem for order $n - 1$. As an illustration we have
$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
= a_1\begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix}
- a_2\begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix}
+ a_3\begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}, \tag{1-7}$$
which coincides with the expansion (1-2) when we interchange rows and columns of the second-order determinants on the right-hand side of (1-7).
7. The value of a determinant is unaltered if a multiple of one row (or column) is added to another.
8. If the cofactors for one row (or column) are combined with the elements of another, as in (1-8), the resulting sum is zero:
$$\sum_{i=1}^{n} a_{ij}A_{ik} = 0, \qquad \sum_{i=1}^{n} a_{ji}A_{ki} = 0, \qquad k \neq j. \tag{1-8}$$
These results follow from those already established. To illustrate the proof of item 7, we have
$$\begin{vmatrix} a_1 & b_1 + mc_1 & c_1 \\ a_2 & b_2 + mc_2 & c_2 \\ a_3 & b_3 + mc_3 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
+ \begin{vmatrix} a_1 & mc_1 & c_1 \\ a_2 & mc_2 & c_2 \\ a_3 & mc_3 & c_3 \end{vmatrix} \tag{1-9}$$
by 3; and the second determinant on the right of (1-9) is zero by 4. The reader should extend the proof to $n$-by-$n$ determinants.
The result 8 follows from 4. Thus, the first expression (1-8) is the expansion of the determinant which arises when the column
$$a_{1k}, a_{2k}, \ldots, a_{nk}$$
is replaced by $a_{1j}, a_{2j}, \ldots, a_{nj}$, and hence it is the expansion of a determinant with two equal columns.
9. If two determinants A and B of order n are given and a new determinant
C is formed, the element in the ith row and jth column of which is obtained
by multiplying each element in the ith row of A by the corresponding dement
in the jth row of B and adding the products thus formed, then C = AB.
Thus, if the elements of A and B are denoted by a tJ and respectively,
then the element c XJ in the ith row and jth column of the product deter-
minant C is
c t} — a 4 i&ji + a l2 bj 2 H b a xn b jn . (1-10)
The validity of rule 9 for determinants of order n follows from considera-
tions entirely similar to those we give next for the case when n = 2.
If the determinants A and B are of second order, formula (1-10) states
that their product C is
Ull&ll 4" &12&12 G11&21 + tt 12&22
a 2lhll + a 22^12 #21^21 4“ U22&22
Since the elements in (1-11) are binomials, we can write C by using prop-
erty 3 as the sum of four determinants:
an&n 011&21 ^12^22
021&11 021&21 a 21^Xl a 22^22
012&12 a ll&21 G12&12 a l$22
a 22^12 a 2l^2l °22&12 <*22^22
On factoring out the elements an and a 2 i in the first determinant, we
obtain a determinant with two like columns, and hence its value is zero.
Similar remarks apply to the fourth determinant. The second deter-
( 1 - 11 )
APPENDIX
746
{app. a
minant, on factoring out fen and fe 2 2 , yields fenfe 2 2 A, while the third haa
the value —5^21*4. Thus
C *= A (fej ife 2 2 — fei2fe2i)
- AB .
Similar, but much more laborious, calculations can be carried out for deter-
minants of higher order to establish the validity of rule 9.
Since the value of a determinant is unchanged when its rows and columns
are interchanged, there are four ways in which the determinant C may
be written. Thus, if we interchange the rows and columns of B, the
elements $c_{ij}$ in C will be given by (1-10) in which the subscripts on the b's
are interchanged.¹
Example 4. Without expanding show that
$$\begin{vmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{vmatrix} = (x_1 - x_2)(x_3 - x_2)(x_1 - x_3).$$
The determinant is a polynomial in $x_1$, and it vanishes when $x_1 = x_2$, since the first two
rows are then proportional. Hence it is divisible by $x_1 - x_2$. Similarly, it is divisible
by $x_3 - x_2$ and $x_1 - x_3$. It therefore equals
$$E\,(x_1 - x_2)(x_3 - x_2)(x_1 - x_3)$$
for some polynomial E. Since the determinant is of degree 3 in $x_1, x_2, x_3$, we must have
E = const, and comparing coefficients of $x_2 x_3^2$ shows that E = 1.
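Example 5. Write the product of the determinants
$$A = \begin{vmatrix} 1 & 2 & 1 \\ 3 & 0 & 1 \\ 0 & 2 & 1 \end{vmatrix} \qquad\text{and}\qquad B = \begin{vmatrix} -1 & 4 & 2 \\ 2 & -1 & 3 \\ 0 & 2 & -1 \end{vmatrix}$$
as a single determinant of third order.
Using rule 9 we find
$$AB = \begin{vmatrix} -1+8+2 & 2-2+3 & 0+4-1 \\ -3+0+2 & 6-0+3 & 0+0-1 \\ 0+8+2 & 0-2+3 & 0+4-1 \end{vmatrix} = \begin{vmatrix} 9 & 3 & 3 \\ -1 & 9 & -1 \\ 10 & 1 & 3 \end{vmatrix}.$$
To check the result, we find on expanding the determinants A, B, and AB that A = -2,
B = 21, and AB = -42.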
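The row-by-row multiplication of rule 9 can be checked mechanically. The following is a minimal sketch in Python, not part of the original text; the helper names `det3` and `row_by_row_product` are chosen here for illustration only. It forms the product determinant of Example 5 from (1-10) and compares the three expansions.

```python
def det3(m):
    """Expand a 3-by-3 determinant on its first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def row_by_row_product(A, B):
    """Rule 9: c_ij is the sum of products of row i of A with row j of B."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in B]
            for row_a in A]


# The two determinants of Example 5, written as lists of rows.
A = [[1, 2, 1], [3, 0, 1], [0, 2, 1]]
B = [[-1, 4, 2], [2, -1, 3], [0, 2, -1]]
C = row_by_row_product(A, B)

print(C)
print(det3(A), det3(B), det3(C))
```

Because the value of a determinant is unchanged when rows and columns are interchanged, replacing B by its transpose in `row_by_row_product` gives the more familiar row-by-column product with the same value.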
Example 6. Show that a trigonometric polynomial
$$y = a_1 \sin x + a_2 \sin 2x + a_3 \sin 3x \tag{1-12}$$
passing through three assigned points $(x_i, y_i)$ is given, in general, by
$$\begin{vmatrix} y & \sin x & \sin 2x & \sin 3x \\ y_1 & \sin x_1 & \sin 2x_1 & \sin 3x_1 \\ y_2 & \sin x_2 & \sin 2x_2 & \sin 3x_2 \\ y_3 & \sin x_3 & \sin 2x_3 & \sin 3x_3 \end{vmatrix} = 0.$$
Expanding on the first row gives
$$c_1 y + c_2 \sin x + c_3 \sin 2x + c_4 \sin 3x = 0$$
where the $c_i$ are the appropriate cofactors. Hence y has the form (1-12), if $c_1 \neq 0$. Moreover,
when $x = x_i$ and $y = y_i$ the equation is true, since two rows of the determinant
are then equal.

¹ Cf. Eq. (16-3), Chap. 4.
PROBLEMS
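1. By a Laplace development on the first row, evaluate
$$\begin{vmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 2 & 3 & 1 \end{vmatrix}, \qquad
\begin{vmatrix} -1 & 2 & 2 \\ -3 & 6 & 6 \\ 5 & 7 & 9 \end{vmatrix}, \qquad
\begin{vmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{vmatrix}, \qquad
\begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{vmatrix}.$$
2. Evaluate the determinants in Prob. 1 by a Laplace development on (a) the first
column, (b) the second row.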
3. Evaluate this determinant by development on
(a) The first column
(b) The second row
(c) The first row
(d) The third column
$$\begin{vmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 1 & -1 \\ 1 & 0 & 0 & 1 \end{vmatrix}$$
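4. Show that
$$\begin{vmatrix} x_1 & 1 \\ x_2 & 1 \end{vmatrix} \qquad\text{and}\qquad \frac{1}{2}\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}$$
represent, respectively, the (signed) length of the segment $(x_1, x_2)$ and the (signed) area of the
triangle with vertices $(x_i, y_i)$.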
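5. Evaluate, using some of the properties 1 to 7:
$$\begin{vmatrix} x & 1 & 1 \\ 1 & x & 1 \\ 1 & 1 & x \end{vmatrix}, \qquad
\begin{vmatrix} y+z & x & x \\ y & x+z & y \\ z & z & x+y \end{vmatrix}, \qquad
\begin{vmatrix} 0 & -a & -b \\ a & 0 & -c \\ b & c & 0 \end{vmatrix}.$$
Hint: In the last determinant, interchange rows and columns.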
6. Write out as determinants of third order the product of the first determinant in
Prob. 1 by the second and third determinants.
7. Using determinants, find a, b, c if $y = a + b \cos x + c \cos 2x$ passes through (0,0),
$(\pi/2, 1)$, $(\pi, -2)$.
8. (a) Find a cubic containing the points (0,1), (1,-1), (3,4), (4,0). Hint: Consider
a determinant with top row $y, 1, x, x^2, x^3$. (b) Write the equation of a polynomial of
degree n whose graph contains n + 1 assigned points $(x_i, y_i)$.
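2. Cramer's Rule. Consider the set of simultaneous equations
$$\begin{aligned} a_1x + b_1y + c_1z &= d_1, \\ a_2x + b_2y + c_2z &= d_2, \\ a_3x + b_3y + c_3z &= d_3. \end{aligned} \tag{2-1}$$
Now, by 2 and 7 of the preceding section,
$$x \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1x & b_1 & c_1 \\ a_2x & b_2 & c_2 \\ a_3x & b_3 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1x + b_1y + c_1z & b_1 & c_1 \\ a_2x + b_2y + c_2z & b_2 & c_2 \\ a_3x + b_3y + c_3z & b_3 & c_3 \end{vmatrix}.$$
Hence if x satisfies (2-1), it is necessary that
$$x \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
= \begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}. \tag{2-2}$$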
The determinant on the left of (2-2) is termed the coefficient determinant
of the system (2-1); we denote it by D. Equation (2-2) and the
corresponding relations for y and z may then be written
$$xD = \begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}, \qquad
yD = \begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}, \qquad
zD = \begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}. \tag{2-3}$$
If $D \neq 0$, we may divide by D to express x, y, and z as quotients of two
determinants.
To show that these values of x, y, and z actually satisfy the system (2-1), substitute
into (2-1) and multiply through by D. The equations become
$$a_k \begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}
+ b_k \begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}
+ c_k \begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}
= d_k \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$$
with k = 1, 2, or 3 respectively. Now, the determinant
$$\begin{vmatrix} a_k & b_k & c_k & d_k \\ a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \end{vmatrix}$$
is zero because two rows are equal, and it yields the desired relation when expanded on
elements of the first row (use Theorem 5 of the preceding section).
The foregoing method applies to n equations in n unknowns, and yields
Cramer's Rule: Let¹
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= k_1, \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= k_2, \\ \cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= k_n \end{aligned} \tag{2-4}$$
be a system of n equations in the n unknowns $x_i$ such that the coefficient
determinant D is not zero. The system (2-4) has a unique solution $x_i = D_i/D$,
where $D_i$ is the determinant formed by replacing the elements $a_{1i}, a_{2i}, \ldots, a_{ni}$
of the ith column of D by $k_1, k_2, \ldots, k_n$ respectively.
Consider the homogeneous system which arises from (2-4) when the
right-hand members are replaced by zero. This system obviously has a
solution $x_1 = x_2 = \cdots = x_n = 0$, the trivial solution. If the coefficient
determinant is not zero, the solution is unique by Cramer's rule. Hence
a homogeneous system can have a nontrivial solution only if the coefficient
determinant is zero. One can prove, conversely, that there is always a
nontrivial solution of the homogeneous equations if the determinant is
zero.
The rectangular array
$$\begin{pmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \end{pmatrix} \tag{2-5}$$
is termed the augmented matrix of the system (2-1). By striking out one
or another column of the matrix (2-5), we are led to the square arrays
$$\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}, \ \ldots$$
Since these arrays are square, they have corresponding determinants.
Now (2-3) shows that all these determinants must be zero if D = 0 and if
the system (2-1) actually has a solution. In other words, if D = 0 but
a third-order determinant formed from (2-5) is not zero, then the system
(2-1) is inconsistent.
¹ A compact derivation of this rule is given in Sec. 15, Chap. 4.

The foregoing results are included in a general theory of linear systems,
which is now discussed. An m-by-n matrix is a system of mn quantities
$a_{ij}$, called elements, arranged in m rows and n columns. The array is
customarily enclosed in parentheses, thus:
$$A \equiv \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$
If m = n, then A is the coefficient matrix of the system (2-4); the augmented
matrix is obtained by adjoining a column with elements (in order) $k_1,
k_2, \ldots, k_n$. If the matrix is square, one can form the determinant of the
matrix, a determinant whose elements have the same arrangement as
those of the matrix. From any matrix, smaller matrices can be formed
by striking out some of the rows and columns. Certain of these smaller
matrices are square, and their determinants are called determinants of the
matrix. A matrix A is said to be of rank r if there is at least one r-rowed
determinant of A that is not zero, whereas all determinants of A of order
higher than r are zero or nonexistent. (The latter alternative arises if r
equals the smaller of the two numbers m and n.) The rank is zero if all
elements are zero. With these preliminaries we can state the following
Fundamental Theorem: Suppose we are given a set of m linear equations
in n unknowns. Let the rank of the coefficient matrix be r, and let the rank of
the augmented matrix be r'. If r' > r, the equations have no solution. If
r' = r = n, there is one, and only one, solution. If r' = r < n, we may give
arbitrary values to n - r of the unknowns and express the others in terms of
these.
The proof is too long for inclusion here. Important special cases were
established, however, by the proof of Cramer’s rule and by the discussion
of (2-5). Further discussion of matrices is given in Chap. 4.
The r unknowns which are expressed in terms of the others must be asso-
ciated with some nonvanishing determinant of order r.
Example 1. By Cramer's rule, find x and y, given
$$\begin{aligned} 3x + y + 2z &= 3, \\ 2x - 3y - z &= -3, \\ x + 2y + z &= 4. \end{aligned} \tag{2-6}$$
The coefficient determinant D is found to be 8, so that
$$8x = \begin{vmatrix} 3 & 1 & 2 \\ -3 & -3 & -1 \\ 4 & 2 & 1 \end{vmatrix} = 8, \qquad
8y = \begin{vmatrix} 3 & 3 & 2 \\ 2 & -3 & -1 \\ 1 & 4 & 1 \end{vmatrix} = 16.$$
Thus, x = 1, y = 2. If z is desired, one can find it from the third equation (2-6):
$$z = 4 - x - 2y = 4 - 1 - 4 = -1.$$
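Cramer's rule translates directly into a short program. The sketch below is an illustration added here, not the authors' procedure; the function names `det` and `cramer` are assumptions made for the example. It evaluates determinants by Laplace development on the first row, which is adequate for the small systems of this section, and uses exact fractions. The data are the coefficients of system (2-6) with right-hand members 3, -3, 4.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace development on the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def cramer(coeffs, rhs):
    """Return [x_1, ..., x_n] with x_i = D_i / D, assuming D is not zero."""
    D = det(coeffs)
    if D == 0:
        raise ValueError("coefficient determinant is zero")
    solution = []
    for i in range(len(coeffs)):
        # D_i: the ith column of D replaced by the right-hand members.
        D_i = [row[:i] + [k] + row[i + 1:] for row, k in zip(coeffs, rhs)]
        solution.append(Fraction(det(D_i), D))
    return solution


coeffs = [[3, 1, 2], [2, -3, -1], [1, 2, 1]]
rhs = [3, -3, 4]
x, y, z = cramer(coeffs, rhs)
print(det(coeffs), x, y, z)
```

Laplace development costs on the order of n! operations, so for large n one would normally solve the system by elimination instead; the rule is of interest chiefly for small systems and for theoretical purposes.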
Example 2. For what values of X do the equations
$$a_2x + b_2y + c_2 = 0.$$
Let (x,y) be the point at which the three lines meet. With this particular choice of
x and y, the three equations (2-12) are satisfied simultaneously. Now, these equations
may be regarded as simultaneous equations in three unknowns x, y, and 1, one of which
(namely, 1) is not zero. Hence the coefficient determinant vanishes:
$$\begin{vmatrix} a & b & c \\ a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{vmatrix} = 0.$$
(The condition is also sufficient if no two of the three lines are parallel. The reader
should observe the duality between points and lines which is illustrated by this and
the following example.)
Example 5. Find a necessary and sufficient condition that the three points (x,y),
$(x_1,y_1)$, $(x_2,y_2)$ lie on a line.
If the equation of the line is
$$ax + by + c = 0 \tag{2-13}$$
we have, besides (2-13),
$$ax_1 + by_1 + c = 0, \qquad ax_2 + by_2 + c = 0.$$
These equations may be regarded as a system in the unknowns a, b, c, which cannot
all vanish if (2-13) represents a line. Hence the coefficient determinant must vanish:
$$\begin{vmatrix} x & y & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{vmatrix} = 0. \tag{2-14}$$
Conversely, (2-14) ensures that the system has a nontrivial solution a, b, c. Compare
Prob. 4, Sec. 1.
Example 6. Show that the following equations are consistent if, and only if, k = 9:
$$\begin{aligned} 2x + 3y &= 1, \\ x - 2y &= 4, \\ 4x - y &= k. \end{aligned} \tag{2-15}$$
The coefficient matrix has rank 2, and hence the equations are consistent if, and only if,
the augmented matrix also has rank 2. This entails
$$\begin{vmatrix} 2 & 3 & 1 \\ 1 & -2 & 4 \\ 4 & -1 & k \end{vmatrix} = 0,$$
which yields 1(7) - 4(-14) + k(-7) = 0 or k = 9. The same result is found if we
regard (2-15) as a system in the three unknowns x, y, k and solve by Cramer's rule.
The reader should obtain the result of Examples 3 and 4 by considering the augmented
matrix, as in the present example.
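The rank test of the fundamental theorem can be sketched in a few lines of Python. This illustration is an addition, not taken from the text: it computes rank by Gaussian elimination in exact rational arithmetic, which is assumed here to agree with the determinant definition of rank given above (a standard fact), and the names `rank` and `classify` are chosen for the example. It is applied to system (2-15) for several values of k.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rectangular array by Gaussian elimination (exact arithmetic)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def classify(coeffs, rhs):
    """Compare the ranks r and r' as in the fundamental theorem."""
    augmented = [row + [k] for row, k in zip(coeffs, rhs)]
    r, r_aug, n = rank(coeffs), rank(augmented), len(coeffs[0])
    if r_aug > r:
        return "inconsistent"
    return "unique solution" if r == n else "infinitely many solutions"


coeffs = [[2, 3], [1, -2], [4, -1]]
for k in (8, 9, 10):
    print(k, classify(coeffs, [1, 4, k]))
```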
PROBLEMS
1. Solve, by Cramer's rule, the systems:
(a) x + 2y + 3z = 3,
    2x - y + z = 6,
    3x + y - z = 4;
(b) 2x + y + 3z = 2,
    3x - 2y - 2z = 1,
    x - y + z = -1;
(c) x + 2y = 1,
    2x - y - 2z = 3,
    -x + y + 3z = 2;
(d) 2x + y + 3z + w = -2,
    5x + 3y - z - w = 1,
    x - 2y + 4z + 3w = 4,
    3x - y + z = 2.
2. Obtain nonzero solutions when they exist.
(a) x 4* 3y — 2* * 0, (b) x — 2y » 0,
2a? — y 4* * ** 0; 3x 4- 3/ •*
2x — y *» 0;
(e) 3x — 2y + z - 0, (d) 2x — 4y 4- 3z ** 0,
x 4- 2y — 2z *® 0, x + 2y - 2z - 0,
2x - ?/ 4- 2z « 0; 3x - 2y + * - 0;
(e) 4x - + z * 0, (/) x + 2y + 2z - 0,
2x — y -f 3z « 0, 3x - y 4- z * 0,
2x — y — 2z *» 0, 2x 4~ 3y + 2z « 0,
fix - 4* 4z « 0; x 4- 4y ~ 2z = 0.
3. Investigate the following systems and find solutions whenever the systems are
consistent:
(a) x - 2j/ - 3, (b) 2* 4- y - * - 1,
2x 4- y «* If x — 2y 4~ z *■ 3,
3x — y » 4; 4x - 3^ 4- « * 5;
(c) 3x 4- » 4, (d) 2x — y 4- 3z «* 4,
x - * 1, x 4- V - 3z » -1,
2x 4- by ** — 1 ; 5x - 2 / 4* 3z *» 7.
4. (a) Give a necessary and sufficient condition that four points in space be coplanar.
(b) Give a necessary and sufficient condition that four planes be concurrent.
5. As in Example 5, find a necessary and sufficient condition that four points lie on a
circle.
6. Give a relation which the coefficients must satisfy if
$$ax^3 + bx^2 + cx + d = 0, \qquad \alpha x^2 + \beta x + \gamma = 0$$
have a common root.
7. Give a condition on the coefficients of a general cubic f(x) if it has a double root.
Hint: f(x) and f'(x) have a root in common.
8. The system $ax + by = c$, $\alpha x + \beta y = \gamma$ represents two lines which may intersect
at one point, may be parallel, or may coincide. Discuss the system geometrically, and
thus obtain all the relevant results involving rank. Hint: Begin by showing that the
lines are parallel if, and only if, the coefficient determinant is zero.
9. An equation ax + by + cz = d represents a plane, and two planes are parallel if,
and only if, corresponding coefficients a, b, c are proportional:
$$a = ka_1, \qquad b = kb_1, \qquad c = kc_1.$$
(You may assume these geometric facts.) As in Prob. 8, give a complete geometric
discussion of the behavior of two equations in three unknowns.
10. As in Prob. 9, discuss the general system of three equations in three unknowns.
APPENDIX B
THE LAPLACE TRANSFORM
The use of Laplace transforms for solving ordinary differential equations
has its origin in a symbolic method developed by the English engineer
Oliver Heaviside. It enables one to solve many problems without going
to the trouble of finding the general solution and then evaluating the arbi-
trary constants. The procedure can be extended to systems of equations,
to partial differential equations, and to integral equations, and it often
yields results more readily than other techniques.
1. Definition of the Laplace Transform. The function F(p) given by
$$F(p) = \int_0^\infty f(x)e^{-px}\,dx \equiv L(f) \tag{1-1}$$
is called the Laplace transform of f(x), and the operator L that transforms
f into F is called the Laplace transform operator. The operator L is linear;
that is,
$$L(f + g) = L(f) + L(g), \tag{1-2}$$
$$L(cf) = cL(f), \tag{1-3}$$
where c is any constant. Indeed, the definition of L shows that (1-2) is
equivalent to
$$\int_0^\infty [f(x) + g(x)]e^{-px}\,dx = \int_0^\infty f(x)e^{-px}\,dx + \int_0^\infty g(x)e^{-px}\,dx$$
and this is a familiar property of integrals. The proof of (1-3) is similar.
To illustrate the calculation of a Laplace transform let $f(x) = e^{ax}$, where
a is constant. The transform is
$$\int_0^\infty e^{ax}e^{-px}\,dx = \int_0^\infty e^{-(p-a)x}\,dx = \left.\frac{e^{-(p-a)x}}{-(p-a)}\right|_0^\infty = \frac{1}{p-a} \tag{1-4}$$
provided p > a. When $p \le a$, the integral diverges.
This example enables us to investigate the convergence of (1-1) for a
general function f(x), provided

f(x) is piecewise continuous¹ on every finite interval
and                                                                  (1-5)
$|f(x)| \le Me^{ax}$ for some choice of the constants M and a.
Under these conditions the integral converges for p > a, just as in the
foregoing example. In fact,
$$\int_0^l |f(x)|e^{-px}\,dx \le \int_0^l Me^{ax}e^{-px}\,dx \le M\int_0^\infty e^{-(p-a)x}\,dx.$$
Since the latter integral has the finite value (1-4), the integral on the
left remains bounded as $l \to \infty$. This establishes not only the convergence
but the absolute convergence of the integral defining L(f). The
convergence is uniform if $p \ge a_0 > a$, where $a_0$ is fixed, and hence the
operations we shall carry out later are justified.
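The value (1-4) can be checked numerically. The sketch below is an illustration added here, not part of the original text: it truncates the integral (1-1) at a finite upper limit and applies Simpson's rule. The function name `laplace_numeric`, the truncation point, and the number of steps are all assumptions chosen for the example; the truncation is harmless only because the integrand decays like $e^{-(p-a)x}$ when p > a.

```python
import math


def laplace_numeric(f, p, upper=60.0, steps=60000):
    """Approximate the integral of f(x) e^{-px} over (0, upper) by Simpson's rule.

    `steps` must be even; `upper` should be large enough that the
    neglected tail of the integral is negligible.
    """
    h = upper / steps
    total = f(0.0) + f(upper) * math.exp(-p * upper)
    for i in range(1, steps):
        x = i * h
        weight = 4 if i % 2 else 2
        total += weight * f(x) * math.exp(-p * x)
    return total * h / 3


a, p = 0.5, 2.0
approx = laplace_numeric(lambda x: math.exp(a * x), p)
exact = 1 / (p - a)
print(approx, exact)
```

Trying a value of p at or below a makes the truncated integral grow without bound as `upper` increases, in agreement with the divergence noted above.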
The integral on the right of the foregoing inequality tends to zero as
$p \to \infty$. This shows that
$$\lim_{p \to \infty} F(p) = 0 \tag{1-6}$$
for all functions F = L(f) such that f satisfies (1-5). It is found, more
generally, that $F(p) \to 0$ if L(f) converges for any finite value $p = p_0$,
even when (1-5) does not hold. Hence, if $\lim F(p) \neq 0$ as $p \to \infty$, then
F(p) cannot be the Laplace transform of any function f(x).
Example 1. Let $f(x) = x^b$. The change of variable t = px yields
$$\int_0^\infty x^b e^{-px}\,dx = \frac{1}{p^{b+1}}\int_0^\infty t^b e^{-t}\,dt.$$
According to Chap. 2, Sec. 14, the latter integral is convergent for b > -1 and
represents the generalized factorial b!. Hence
$$L(x^b) = \frac{b!}{p^{b+1}} \qquad \text{for } b > -1. \tag{1-7}$$
When b is negative, $x^b$ is infinite at x = 0 and (1-5) does not hold.
By comparing the integral for L(f) with that for $L(Mx^b)$ near the origin, one finds
that (1-5) is really needed only for $x \ge 1$, provided f(x) is piecewise continuous for
x > 0 and satisfies the additional condition
$$|f(x)| \le Mx^b \quad \text{on } 0 < x < 1 \text{ for some constant } b > -1.$$
Whenever we take a Laplace transform L(f) in the sequel, it is understood that p > a
and that f satisfies (1-5) or the more refined condition just described. On the other hand
it is not required that f(x) be real. For example, (1-4) holds when a is complex provided
p > Re (a).
¹ See Chap. 2, Sec. 25. The following discussion uses a comparison test for integrals,
which can be verified in the same way as the corresponding fact for series. Cf. Chap. 2,
Sec. 4, Theorem I, and Chap. 2, Sec. 6, Theorem I.
Example 2. The choice a = ib in (1-4) yields
$$L(e^{ibx}) = L(\cos bx + i \sin bx) = \frac{1}{p - ib}.$$
Upon equating real and imaginary parts with due regard to (1-2) we get
$$L(\cos bx) = \frac{p}{p^2 + b^2}, \qquad L(\sin bx) = \frac{b}{p^2 + b^2} \tag{1-8}$$
for all real b. Differentiation with respect to b gives
$$L(x \cos bx) = \frac{p^2 - b^2}{(p^2 + b^2)^2}, \qquad L(x \sin bx) = \frac{2bp}{(p^2 + b^2)^2}. \tag{1-9}$$
Proceeding in this fashion one can construct a table of transforms, such as Table 2
given at the end of this appendix. Indeed, we have already derived entries 1a, 2a, 2b,
3b, and 4a of Table 2; and entry 3a can be obtained from (1-8) and (1-9), since L is linear.
2. Some Uses of the Laplace Transform. If L[f(x)] = F(p), integration
by parts leads to
$$L[f'(x)] = pF(p) - f(0) \tag{2-1}$$
provided the hypothesis (1-5) applies to f'(x) as well as to f(x). That is,
$$\int_0^\infty e^{-px}f'(x)\,dx = e^{-px}f(x)\Big|_0^\infty + \int_0^\infty pe^{-px}f(x)\,dx. \tag{2-2}$$
For sufficiently large p Eq. (1-5) shows that $e^{-px}f(x) \to 0$ as $x \to \infty$, and
the desired result follows.
The choice f(x) = y in (2-1) gives
$$L(y') = pL(y) - y(0) \tag{2-3}$$
and the choice f(x) = y' gives
$$L(y'') = pL(y') - y'(0) = p[pL(y) - y(0)] - y'(0)$$
in view of (2-3). Hence
$$L(y'') = p^2L(y) - py(0) - y'(0). \tag{2-4}$$
The transform of the higher derivatives can be obtained similarly. For
instance,
$$L(y''') = p^3L(y) - p^2y(0) - py'(0) - y''(0). \tag{2-5}$$
These relations enable us to solve differential equations with constant
coefficients.
As an illustration consider the problem
$$y'' + y = f(t), \qquad y(0) = y'(0) = 0, \tag{2-6}$$
which describes the response of a resonant circuit to an input f(t). To make
the problem definite let f(t) = 0 for t < 0, but f(t) = 1 for t > 0. (A switch
to a constant-voltage source is closed at time t = 0 and remains closed
thereafter.) The transform of (2-6) gives
$$p^2L(y) + L(y) = L(f) = p^{-1}$$
when we use (2-4) and the entry 1a of Table 2 with a = 0. Solving for
L(y),
$$L(y) = \frac{1}{p(p^2 + 1)}.$$
It can be shown that a continuous function y is determined on $(0,\infty)$
as soon as its transform L(y) is known. Hence, the foregoing equation
contains the solution implicitly. To find the solution explicitly we use
partial fractions; thus
$$L(y) = \frac{1}{p} - \frac{p}{p^2 + 1}.$$
The entries 1a and 2b of Table 2 give the desired answer
$$y = 1 - \cos t \ \text{ for } t > 0, \qquad y = 0 \ \text{ for } t < 0. \tag{2-7}$$
It is an especial merit of the Laplace transform that the initial conditions
are satisfied automatically. In the foregoing illustration we did not find
the general solution and then determine the constants so that y(0) = y'(0)
= 0. Nevertheless, the expression (2-7) satisfies these conditions, as the
reader can verify.
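The answer (2-7) can be compared with a direct numerical integration of (2-6). The following sketch is an added illustration and not the method of the text: it integrates y'' + y = 1 with y(0) = y'(0) = 0 by the classical fourth-order Runge-Kutta method, with the step count and the name `rk4_response` chosen here as assumptions, and compares the result with 1 - cos t.

```python
import math


def rk4_response(t_end, steps=2000):
    """Integrate y'' + y = 1, y(0) = y'(0) = 0, up to t_end (Runge-Kutta)."""
    h = t_end / steps
    y, v = 0.0, 0.0  # v stands for y'

    def deriv(y, v):
        # First-order system: y' = v, v' = 1 - y.
        return v, 1.0 - y

    for _ in range(steps):
        k1y, k1v = deriv(y, v)
        k2y, k2v = deriv(y + 0.5 * h * k1y, v + 0.5 * h * k1v)
        k3y, k3v = deriv(y + 0.5 * h * k2y, v + 0.5 * h * k2v)
        k4y, k4v = deriv(y + h * k3y, v + h * k3v)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y


t = 5.0
numeric = rk4_response(t)
exact = 1 - math.cos(t)
print(numeric, exact)
```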
To illustrate further the introduction of initial conditions we shall solve
$$y''' - y' = \sin x \tag{2-8}$$
subject to
$$y(0) = 2, \qquad y'(0) = 0, \qquad y''(0) = 1. \tag{2-9}$$
The Laplace transform of (2-8) yields
$$p^3L(y) - 2p^2 - 1 - [pL(y) - 2] = L(\sin x) = (p^2 + 1)^{-1}$$
when we use (2-5), (2-3), and entry 2a of Table 2. Solving for L(y),
$$L(y) = \frac{2p^2 - 1}{p^3 - p} + \frac{1}{(p^2 + 1)(p^3 - p)}.$$
By partial fractions
$$L(y) = \frac{3}{4(p + 1)} + \frac{3}{4(p - 1)} + \frac{p}{2(p^2 + 1)}$$
and entries 1a and 2b of Table 2 give
$$y = \tfrac{3}{4}e^{-x} + \tfrac{3}{4}e^{x} + \tfrac{1}{2}\cos x.$$
The Laplace transform can also be used to solve systems of differential
equations. As an illustration, let it be required to find y if
$$\begin{aligned} y' + 2z' + y - z &= 25, \\ 2y' + z &= 25e^x, \end{aligned} \tag{2-10}$$
with the initial conditions
$$y(0) = 0, \qquad z(0) = 25.$$
The transform of (2-10) leads to
$$\begin{aligned} pL(y) + 2[pL(z) - 25] + L(y) - L(z) &= \frac{25}{p}, \\ 2pL(y) + L(z) &= \frac{25}{p - 1}, \end{aligned}$$
which simplifies to
$$\begin{aligned} (p + 1)L(y) + (2p - 1)L(z) &= \frac{25(2p + 1)}{p}, \\ 2pL(y) + L(z) &= 25(p - 1)^{-1}. \end{aligned} \tag{2-11}$$
Solving for L(y), we get
$$L(y) = \frac{25}{4p(p - 1)^2(p + \tfrac{1}{4})} = \frac{25}{p} - \frac{9}{p - 1} + \frac{5}{(p - 1)^2} - \frac{16}{p + \tfrac{1}{4}}.$$
According to entries 1a and 1b of Table 2,
$$y = 25 - 9e^x + 5xe^x - 16e^{-x/4}.$$
It should be noted that this method enables us to find y without finding
z. Also no extraneous roots are introduced, and the initial conditions are
satisfied automatically.
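A quick numerical check of this solution is sketched below; it is an added illustration, not part of the text. The function y and its derivative are coded from the closed form $y = 25 - 9e^x + 5xe^x - 16e^{-x/4}$, z is recovered from the second equation of (2-10) as $z = 25e^x - 2y'$, and the residual of the first equation is evaluated with a central difference for z'. The helper names and the step h are assumptions made for the example.

```python
import math


def y(x):
    return 25 - 9 * math.exp(x) + 5 * x * math.exp(x) - 16 * math.exp(-x / 4)


def dy(x):
    # Derivative of y, obtained by differentiating the closed form term by term.
    return -4 * math.exp(x) + 5 * x * math.exp(x) + 4 * math.exp(-x / 4)


def z(x):
    # Second equation of (2-10) solved for z.
    return 25 * math.exp(x) - 2 * dy(x)


def central_difference(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def residual_first_equation(x):
    """y' + 2z' + y - z - 25, which should vanish for a true solution."""
    return dy(x) + 2 * central_difference(z, x) + y(x) - z(x) - 25


print(y(0.0), z(0.0))
print([residual_first_equation(x) for x in (0.5, 1.0, 2.0)])
```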
PROBLEMS
1. If y satisfies y'' - 3y' + 2y = 4, y(0) = 2, y'(0) = 3, show that
$$L(y) = \frac{2p^2 - 3p + 4}{p(p - 1)(p - 2)}.$$
Deduce that $y = 2 - 3e^x + 3e^{2x}$.
2. Solve by means of the Laplace transform
$$y'' + 4y = \sin x, \qquad y(0) = 1, \quad y'(0) = 0.$$
3. Find L(y), and solve
$$y''' + y'' = e^x + x + 1, \qquad y(0) = y'(0) = y''(0) = 0.$$
4. Find L(z) in Eqs. (2-10) and (2-11) of the text, and deduce that
$$z = 33e^x - 10xe^x - 8e^{-x/4}.$$
5. Solve by means of the Laplace transform and check by substituting into the given
system:
$$\begin{aligned} y' + 3y + z' + 2z &= e^{-2x}, & y(0) &= 0, \\ 2y' + 2y + z' + z &= 1, & z(0) &= 0. \end{aligned}$$
6. Find y, given that
$$y' + z' = z, \qquad z' + w' = w, \qquad w' + y' = y, \qquad y(0) = z(0) = w(0) = 1.$$
7. If f'(x) satisfies (1-5), show that f(x) satisfies a condition of the same type, though
perhaps with a different value of a. Hint: $f(x) = \int_0^x f'(t)\,dt + f(0)$.
3. Discontinuities. The Dirac Distribution. Closing a switch in an
electrical circuit introduces a discontinuity in the corresponding input
function [cf. the discussion of (2-6)]. A disconti-
nuity may also be produced by a sudden impulse
in a mechanical system. The Laplace transform
is a most effective means of dealing with such situations,
because the transform of many discontinuous
functions is just as simple as $L(e^x)$ or $L(\sin x)$.
In this section we shall consider the response
of a system to an impulse function which acts
over a very short time interval but produces a
large effect. The physical situation is typified by
a lightning stroke on a transmission line or by a
hammer blow on a mechanical system.
To formulate the idea of an impulse, let a be
a small positive constant and let $\delta_a(x)$ be the function illustrated in
Fig. 1. That is,
$$\delta_a(x) = a^{-1} \quad \text{for } 0 < x < a$$
and $\delta_a(x) = 0$ elsewhere. The Laplace transform is
$$L[\delta_a(x)] = \int_0^a a^{-1}e^{-px}\,dx = (pa)^{-1}(1 - e^{-pa}).$$
By the Taylor series for $e^{-pa}$,
$$(pa)^{-1}(1 - e^{-pa}) = 1 - \tfrac{1}{2}(pa) + \cdots \to 1$$
as $a \to 0$. It is customary to introduce an expression $\delta(x)$ which is thought
to be the limit of $\delta_a(x)$ as $a \to 0$ and to say that
$$L[\delta(x)] = 1. \tag{3-1}$$
We call $\delta(x)$ the Dirac distribution or the unit impulse, and we take (3-1)
as the basic defining property. The legitimacy of this procedure requires
further discussion, which will be given presently. First, however, the use
of $\delta(x)$ will be illustrated by an example.
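The limit that motivates (3-1) is easy to tabulate. The short Python sketch below is an added illustration; the function name `transform_of_pulse` and the sample values of p and a are assumptions. It evaluates $(pa)^{-1}(1 - e^{-pa})$ for decreasing a and prints it beside the two-term Taylor approximation $1 - \tfrac12 pa$.

```python
import math


def transform_of_pulse(p, a):
    """L[delta_a] = (1 - e^{-pa}) / (pa); expm1 limits cancellation for small pa."""
    return -math.expm1(-p * a) / (p * a)


p = 3.0
for a in (1.0, 0.1, 0.01, 0.001):
    # Columns: a, exact transform of the pulse, two-term Taylor approximation.
    print(a, transform_of_pulse(p, a), 1 - 0.5 * p * a)
```

For fixed a the transform still tends to zero as p increases, as (1-6) requires; only the limiting expression with a tending to zero fails to do so.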
The displacement y of a weight suspended by a spring with stiffness 1
is determined by the system
$$y'' + y = f(t), \qquad y(0) = y'(0) = 0,$$
where f(t) = force function
      t = time
      ' = d/dt.
To determine the response to a unit impulse at t = 0 we replace f(t) by
$\delta(t)$; thus
$$y'' + y = \delta(t).$$
The Laplace transform yields
$$p^2L(y) + L(y) = L[\delta(t)] = 1$$
when we use (3-1). Hence $L(y) = 1/(1 + p^2)$, or
$$y = \sin t, \qquad t > 0.$$
The initial conditions require y = 0 for t < 0, and the graph has the
appearance illustrated in the accompanying Fig. 2.
The function y is continuous, but it is not differentiable at
t = 0. Thus, the initial condition y'(0) = 0 is not satisfied. Indeed,
$y'(t) \to 0$ as $t \to 0$ through negative values, but $y'(t) = \cos t \to 1$
as $t \to 0$ through positive values. The unit impulse produces
a jump, of magnitude 1, in y'(t).
Fig. 2

To investigate the meaning of the foregoing result, we solve
$$y'' + y = \delta_a(t), \qquad y(0) = y'(0) = 0$$
and then let $a \to 0$. The general solution is
$$\begin{aligned} y &= c_0 \sin t + c_1 \cos t, & t &\le 0, \\ y &= c_2 \sin t + c_3 \cos t + a^{-1}, & 0 \le t &\le a, \\ y &= c_4 \sin t + c_5 \cos t, & t &\ge a. \end{aligned}$$
By the initial conditions,
$$c_0 = c_1 = c_2 = 0, \qquad c_3 = -a^{-1}.$$
To determine $c_4$ and $c_5$ we require that y and y' be continuous at t = a. This gives
$$\begin{aligned} -a^{-1}\cos a + a^{-1} &= c_4 \sin a + c_5 \cos a, \\ a^{-1}\sin a &= c_4 \cos a - c_5 \sin a. \end{aligned}$$
Hence $c_4 = a^{-1}\sin a$, $c_5 = a^{-1}(\cos a - 1)$, and our solution is
$$\begin{aligned} y &= 0, & t &\le 0, \\ y &= a^{-1}(1 - \cos t), & 0 \le t &\le a, \\ y &= a^{-1}\sin a \sin t - a^{-1}(1 - \cos a)\cos t, & a &\le t. \end{aligned}$$
Since $a^{-1}(1 - \cos t) \le a^{-1}(1 - \cos a) \to 0$ as $a \to 0$, and since $a^{-1}\sin a \to 1$ as
$a \to 0$, we see that letting $a \to 0$ gives the solution which was obtained previously by
the method of Laplace transforms.
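The passage to the limit can be watched numerically. The sketch below is an added illustration, with the function name `response_to_pulse` and the sample values chosen here as assumptions: it codes the three-piece solution for the pulse of width a and compares it with sin t at a fixed time as a decreases.

```python
import math


def response_to_pulse(t, a):
    """Solution of y'' + y = delta_a(t), y(0) = y'(0) = 0, for pulse width a > 0."""
    if t <= 0:
        return 0.0
    if t <= a:
        return (1 - math.cos(t)) / a
    return (math.sin(a) * math.sin(t) - (1 - math.cos(a)) * math.cos(t)) / a


t = 1.0
for a in (0.5, 0.1, 0.01, 0.001):
    # The difference from sin t shrinks roughly in proportion to a.
    print(a, response_to_pulse(t, a), math.sin(t))
```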
Although $\delta(x)$ is often called the “Dirac delta function,” it is not a
function. Indeed, we have already observed that $L(f) \to 0$ as $p \to \infty$ for
every function f, and $\delta$ does not have this property, because $L(\delta) = 1$.
It is possible to generalize the concept of function and to generalize,
correspondingly, the definition of L. The process leads to a branch of
mathematics known as the theory of distributions.¹ In this theory manipulations
with $\delta(x)$ of the type carried out in the foregoing discussion are fully
justified.
Although a brief and correct definition of the unit impulse $\delta(x)$ is not
easily given, it is easy to define what is meant by the response of a system
to the unit impulse. Namely, find the response to the function $\delta_a(x)$, as
in the foregoing example, and then let $a \to 0$. The Laplace transform
gives the result of such a calculation directly, without introduction of
$\delta_a(x)$.
PROBLEMS
1. The voltage V of a certain circuit satisfies
$$V'' + 4V' + 3V = E(t)$$
where E is the applied voltage. Find the response of the system to a unit impulse at
t = 0 if V = 0 for t < 0.
2. (a) Solve the equations
$$y' = \delta(x), \qquad y'' = \delta(x), \qquad y''' = \delta(x)$$
assuming that y = 0 for x < 0 and that y and as many derivatives as possible are
continuous. (b) Show that y, y', and y'' have a jump of value 1 at x = 0 in the three cases,
respectively.
3. A certain function U{x) satisfies
aHJ" -p 2 U - -**«(*), * >0
where a and 0 are positive constants. It is known further that 17(— x) U(x) t U is
continuous, and U 0 as x — ♦ <*>. Obtain the solution
U « (2 *0)“V W « )W .
Hint : In forming L(U"), take U(0) * c } U'{ 0) = 0 where c is a suitably chosen constant
¹ L. Schwartz, “La théorie de distribution,” Hermann & Cie, Paris, 1950. See also
B. Friedman, “Principles and Techniques of Applied Mathematics,” chap. 3, John
Wiley & Sons, Inc., New York, 1956.
4. The singular solution u(z,() for heat conduction satisfies
tt< - oVm, u(*,0) - >$«(*)•
(a) If U(x,p ) — L(u), where the transform is with respect to t, show that <**17** — pV
m — (6) Using the result of Prob. 3 followed by Table 2, deduce that
u(*,t)
-—x®/ (4a*<)
2«M)*
Hint: The role taken by x in the table is taken by t in this problem.
4. Additional Properties of the Transform. The usefulness of the Laplace
transform is greatly increased by the properties tabulated in Table 1.
Entries 1a, 1b, and 4a were derived in the foregoing discussion, and the
others will be derived now. To deduce the relation 2a we have
$$\int_0^\infty f(x - c)e^{-px}\,dx = \int_{-c}^\infty f(t)e^{-p(t+c)}\,dt = e^{-pc}\int_{-c}^\infty f(t)e^{-pt}\,dt$$
upon setting t = x - c. The limits $(-c,\infty)$ can be changed to $(0,\infty)$ if
f(t) = 0 on the interval (-c,0), and 2a follows. In particular, 2a holds
if c > 0 and f(t) = 0 for t < 0. The relation 2b is simply the identity
$$\int_0^\infty e^{-(p-c)x}f(x)\,dx = \int_0^\infty e^{-px}\,[e^{cx}f(x)]\,dx.$$
This is valid without restriction on c, provided p is large enough.
For 3a we let t = cx to obtain
$$\int_0^\infty f(cx)e^{-px}\,dx = \int_0^\infty f(t)e^{-(p/c)t}\,d\!\left(\frac{t}{c}\right) = \frac{1}{c}F\!\left(\frac{p}{c}\right)$$
as desired, provided c > 0. Writing 1/c instead of c in 3a gives 3b, again
for c > 0.
The result 4b follows by differentiating (1-1). For 5a we apply 4a to
the function
$$f_1(x) = \int_0^x f(\xi)\,d\xi$$
noting that $f_1(0) = 0$ and that $f_1'(x) = f(x)$ at points of continuity. The
result 5b follows from integration of (1-1).
The convolution theorem, item 6 in Table 1, can be established by the
following device: Since the Laplace transform involves f(x) only on the
range $(0,\infty)$, we can agree to take f(x) = 0 for all negative x. With a
similar convention for g(x), the respective Laplace transforms may be
written in the form¹
¹ Transforms of the type (4-1) are called bilateral, in contrast to the unilateral
transform (1-1). An account of the bilateral Laplace transform may be found in B. Van
der Pol and H. Bremmer, “Operational Calculus,” Cambridge University Press, London,
1950.

$$L(f) = \int_{-\infty}^\infty e^{-px}f(x)\,dx, \qquad L(g) = \int_{-\infty}^\infty e^{-px}g(x)\,dx, \tag{4-1}$$
and the function h(x) of Table 1, entry 6, is equal to
$$h(x) = \int_{-\infty}^\infty f(\xi)g(x - \xi)\,d\xi. \tag{4-2}$$
Indeed, the lower limit $-\infty$ in (4-2) may be replaced by zero because $f(\xi)
= 0$ when $\xi$ is negative, and the upper limit may be replaced by x because
$g(x - \xi) = 0$ when $x - \xi$ is negative. Given (4-1) and (4-2), the convolution
theorem L(h) = L(f)L(g) can be proved by a discussion which is
practically identical with a discussion given previously, and hence we do
not repeat the argument here.¹
Example: Periodic Functions. Let $P_0(x)$ be the function illustrated in Fig. 3, so that
$$P_0(x) = 1 \ \text{ for } 0 < x < a, \qquad P_0(x) = 0 \ \text{ elsewhere.}$$
Direct computation gives the transform
$$L[P_0(x)] = \int_0^a e^{-px}\,dx = p^{-1}(1 - e^{-ap}).$$
If the function is translated c units to the right, as shown in Fig. 4, the result is
$$L[P_0(x - c)] = p^{-1}(1 - e^{-ap})e^{-pc} \tag{4-3}$$
by Table 1, entry 2a. Upon choosing c = 0, c = b, c = 2b, c = 3b, ... and adding, we get
a square wave² y = P(x). According to (4-3) the Laplace transform is
$$L[P(x)] = p^{-1}(1 - e^{-ap})(1 + e^{-pb} + e^{-2pb} + e^{-3pb} + \cdots) = \frac{1 - e^{-ap}}{p(1 - e^{-bp})}$$
when we recall the formula 1/(1 - r) for sum of a geometric series (Chap. 2, Sec. 1).
The procedure just described can be applied to any periodic function P(x) and yields
the formula
$$L[P(x)] = (1 - e^{-bp})^{-1}L[P_0(x)]$$
where b is the period and where $P_0(x) = P(x)$ on 0 < x < b but $P_0(x) = 0$ elsewhere.
For example, the reader can verify that the transform of the sawtooth wave shown in
Fig. 6 is
$$(1 - e^{-bp})^{-1}ab^{-1}p^{-2}[1 - e^{-bp}(1 + bp)]. \tag{4-4}$$

¹ See Chap. 6, Sec. 18. In the present case the integrals are absolutely convergent,
the change in order of integration is justified, and the process actually gives a valid
proof.
² See Fig. 5. It is left for the reader to sketch the graph when b = a and when b < a.
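The summation of translated pulses can be checked numerically. The sketch below is an added illustration, not part of the text; the function names and the sample values of p, a, and b are assumptions. It adds the transforms (4-3) for c = 0, b, 2b, ... and compares the partial sum with the closed form obtained from the geometric series.

```python
import math


def square_wave_transform_series(p, a, b, terms=200):
    """Sum of the transforms of pulses of width a translated by 0, b, 2b, ..."""
    pulse = (1 - math.exp(-a * p)) / p
    return sum(pulse * math.exp(-p * b * n) for n in range(terms))


def square_wave_transform_closed(p, a, b):
    """Closed form (1 - e^{-ap}) / (p (1 - e^{-bp})) from the geometric series."""
    return (1 - math.exp(-a * p)) / (p * (1 - math.exp(-b * p)))


p, a, b = 1.5, 1.0, 3.0
series_value = square_wave_transform_series(p, a, b)
closed_value = square_wave_transform_closed(p, a, b)
print(series_value, closed_value)
```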
PROBLEMS
1. Find a function f(x) whose transform is
$$\frac{2p - 5}{3p^2 + 12p + 8}.$$
Hint: By completing the square the expression can be written in the form
$$\frac{2p - 5}{3(p + 2)^2 - 4} = \frac{2(p + 2) - 9}{3(p + 2)^2 - 4} = \frac{2}{3}\,\frac{p + 2}{(p + 2)^2 - \tfrac{4}{3}} - \frac{3}{(p + 2)^2 - \tfrac{4}{3}}.$$
Use Table 2, entries 2a and 2b, with $a = 2i/\sqrt{3}$ (see also Table 2, entry 7). Then
use Table 1, entry 2b, with c = -2.
2. As in Prob. 1 find a function whose transform is
3. Derive Table 2, entry 1b, from Table 2, entry 4a.
4. Derive Table 2, 5a, from Table 2, 5b, and Table 1, 4b.
5. Derive Table 2, 4b, from Table 2, 2a, and Table 1, 5b.
6. If $f(x) = x^{a-1}$ and $g(x) = x^{b-1}$, show that
$$f * g = x^{a+b-1}\int_0^1 t^{a-1}(1 - t)^{b-1}\,dt.$$
Hint: Let $\xi = tx$ in the definition of f * g (Table 1, entry 6).
7. From Prob. 6 deduce the Euler formula for the beta function
$$\int_0^1 t^{a-1}(1 - t)^{b-1}\,dt = \frac{(a - 1)!\,(b - 1)!}{(a + b - 1)!}.$$
Hint: By the convolution theorem applied to the result of Prob. 6,
5. Steady-state Solutions. The Laplace transform will now be used to
solve the general linear equation with constant coefficients,
$$y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1y' + a_0y = f(x), \tag{5-1}$$
subject to the initial conditions
$$y(0) = 0, \quad y'(0) = 0, \quad \ldots, \quad y^{(n-1)}(0) = 0. \tag{5-2}$$
The solution satisfying (5-2) is called the steady-state solution of (5-1),
because in many physical problems the effect of the initial conditions decays
exponentially as x increases.
By repeated use of Table 1, entry 4a,
$$L(y^{(k)}) = p^kL(y), \tag{5-3}$$
for k = 0, 1, 2, ..., n, provided (5-2) holds. Hence the transform of (5-1)
yields G(p)L(y) = L(f), or
$$L(y) = \frac{1}{G(p)}L(f), \tag{5-4}$$
where $G(p) = p^n + a_{n-1}p^{n-1} + \cdots + a_1p + a_0$.
Determination of y from (5-4) is especially easy when G(p) has only simple
roots $p_k$. Indeed, expanding 1/G(p) in partial fractions leads to
$$L(y) = L(f)\sum \frac{A_k}{p - p_k} \tag{5-5}$$
where the $A_k$'s are constant.¹ Since Table 2, entry 1a, gives
$$\sum \frac{A_k}{p - p_k} = \sum A_kL(e^{p_kx}) = L\Big(\sum A_ke^{p_kx}\Big),$$
Eq. (5-5) may be written
$$L(y) = L(f)\,L\Big(\sum A_ke^{p_kx}\Big).$$
Comparing with Table 1, entry 6, gives Heaviside's expansion theorem
$$y = \int_0^x f(\xi)\sum A_ke^{p_k(x - \xi)}\,d\xi. \tag{5-6}$$

¹ If we multiply through by $p - p_k$ and let $p \to p_k$, it is found that $1/A_k = G'(p_k)$.
The essence of the method is that (5-4) leads to
Uv) - Uf)UQ)
provided g is a function such that 1 /G(p) = By the convolution theorem,
v - / * o - - i) dt
This formula is valid even when G(p) has multiple roots, though the determination of
g may then be more difficult.
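The expansion theorem can be tried numerically. The following is a minimal sketch, not part of the text: it assumes G(p) is monic with simple real roots, computes $A_k = 1/G'(p_k)$ as a product over the other roots, and evaluates the convolution integral by the midpoint rule. The function names and the sample polynomial $G(p) = p^2 + 4p + 3$ (roots −1 and −3) are illustrative choices only.

```python
import math

def heaviside_coefficients(roots):
    """Return A_k = 1 / G'(p_k) for a monic G with the given simple roots."""
    coeffs = []
    for k, pk in enumerate(roots):
        deriv = 1.0
        for j, pj in enumerate(roots):
            if j != k:
                deriv *= (pk - pj)  # G'(p_k) = product of (p_k - p_j), j != k
        coeffs.append(1.0 / deriv)
    return coeffs

def steady_state(f, roots, x, steps=2000):
    """Approximate y(x) = integral_0^x f(xi) * sum_k A_k exp(p_k (x - xi)) d xi."""
    coeffs = heaviside_coefficients(roots)
    h = x / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * h  # midpoint of the i-th subinterval
        g = sum(a * math.exp(p * (x - xi)) for a, p in zip(coeffs, roots))
        total += f(xi) * g
    return total * h

# Sample use: y'' + 4y' + 3y = 1 with y(0) = y'(0) = 0.
approx = steady_state(lambda t: 1.0, [-1.0, -3.0], 1.0)
exact = 1.0 / 3.0 - math.exp(-1.0) / 2.0 + math.exp(-3.0) / 6.0
```

For this sample equation the closed form $y = 1/3 - e^{-x}/2 + e^{-3x}/6$ can be checked by direct substitution, so the quadrature result can be compared against it.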
The function g can be thought to be the steady-state solution of
$$g^{(n)} + a_{n-1}g^{(n-1)} + \cdots + a_1 g' + a_0 g = \delta(x)$$ (5-7)
because the transform of (5-7) yields G(p)L(g) = 1. However, since we have not
developed the theory of distributions, it is better to avoid the use of $\delta(x)$. This question
will be discussed next.
Let h(x) be the steady-state solution of
$$h^{(n)} + a_{n-1}h^{(n-1)} + \cdots + a_1 h' + a_0 h = I(x)$$ (5-8)
where I(x) denotes the Heaviside unit function:
$$I(x) = 0 \ \text{for } x < 0, \qquad I(x) = 1 \ \text{for } x > 0.$$
The Laplace transform of (5-8) yields G(p)L(h) = 1/p, so that
$$L(h) = \frac{1}{pG(p)}.$$
Writing (5-4) in the form
$$L(y) = pL(f)\, \frac{1}{pG(p)}$$
we obtain
$$L(y) = pL(f)L(h) = [L(f') + f(0)]L(h) = L(f')L(h) + f(0)L(h)$$
by Table 1, entry 4a. The convolution theorem now yields
$$y = \int_0^x f'(\xi)\, h(x - \xi)\, d\xi + f(0)h(x).$$ (5-9)
Thus, the steady-state solution of (5-1) can be obtained from the steady-state
solution of (5-8) by means of the formula (5-9). This important fact is
known as the superposition principle.
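Formula (5-9) can be checked on a small case. The sketch below is an illustration under stated assumptions, not the text's own example: it takes $G(p) = p^2 + 4p + 3$, for which the unit-step response is $h(x) = 1/3 - e^{-x}/2 + e^{-3x}/6$, and the forcing function $f(x) = e^{-2x}$, for which the steady-state solution is $y = (e^{-x} + e^{-3x})/2 - e^{-2x}$. The function names are hypothetical.

```python
import math

def h(x):
    """Unit-step response of y'' + 4y' + 3y = I(x) with zero initial data."""
    return 1.0 / 3.0 - math.exp(-x) / 2.0 + math.exp(-3.0 * x) / 6.0

def superposition(f_prime, f0, x, steps=2000):
    """y(x) = integral_0^x f'(xi) h(x - xi) d xi + f(0) h(x), midpoint rule."""
    step = x / steps
    integral = 0.0
    for i in range(steps):
        xi = (i + 0.5) * step
        integral += f_prime(xi) * h(x - xi)
    return integral * step + f0 * h(x)

# Forcing f(x) = exp(-2x): f'(x) = -2 exp(-2x), f(0) = 1.
y_num = superposition(lambda t: -2.0 * math.exp(-2.0 * t), 1.0, 1.0)
y_exact = (math.exp(-1.0) + math.exp(-3.0)) / 2.0 - math.exp(-2.0)
```

Only the step response h enters the computation; the polynomial G(p) itself is never used once h is known, which is the practical content of the principle.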
As in the derivation of (2-4) one can show that
$$L(y^{(k)}) = p^k L(y) - p^{k-1}y(0) - p^{k-2}y'(0) - \cdots - y^{(k-1)}(0).$$
By means of this formula the Laplace transform can be used to solve (5-1) subject to
the general initial conditions
$$y(0) = y_0, \quad y'(0) = y_1, \quad \ldots, \quad y^{(n-1)}(0) = y_{n-1}.$$
It should be emphasized, however, that the superposition principle applies to
steady-state solutions only.
PROBLEMS
1. Find the steady-state solution of
$$y'' + 3y' + 2y = f(x)$$
by use of Heaviside's expansion theorem.
2. Evaluate the result of Prob. 1 explicitly when
(a) f(x) = I(x), (b) $f(x) = e^{ax}$, (c) f(x) = x.
3. By means of the superposition principle obtain the solutions (b) and (c) in Prob. 2
from the solution (a).
6. Integral Equations. An equation of the type
$$g(x) = \lambda f(x) + \int_0^x f(\xi)\, k(x - \xi)\, d\xi$$ (6-1)
where λ is constant is called an integral equation. It is supposed that g
and k are known and that f is to be found. Because of its close relation
to the convolution theorem, this equation lends itself to analysis by means
of the Laplace transform. Indeed, taking the transform of (6-1) yields
$$L(g) = \lambda L(f) + L(f)L(k)$$
when we use the convolution theorem. Hence
$$L(f) = \frac{L(g)}{\lambda + L(k)}$$
and from this, f can often be found.
The process will now be illustrated
by an example.
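As a quick numerical sanity check of this transform method, separate from the mechanical example that follows, consider the assumed data λ = 1, k(x) = 1, g(x) = x. Then L(g) = 1/p² and L(k) = 1/p, so L(f) = 1/[p(p + 1)] and f(x) = 1 − e^{−x}. The sketch below (function names are illustrative) evaluates the residual of the integral equation by the midpoint rule.

```python
import math

def convolution(f, k, x, steps=2000):
    """Approximate integral_0^x f(xi) k(x - xi) d xi by the midpoint rule."""
    step = x / steps
    return step * sum(
        f((i + 0.5) * step) * k(x - (i + 0.5) * step) for i in range(steps)
    )

def residual(f, g, k, lam, x):
    """g(x) - lam*f(x) - (f*k)(x); zero when f solves the integral equation."""
    return g(x) - lam * f(x) - convolution(f, k, x)

# Candidate solution obtained from L(f) = L(g) / (lam + L(k)).
f = lambda x: 1.0 - math.exp(-x)
g = lambda x: x
k = lambda x: 1.0
residuals = [residual(f, g, k, 1.0, x) for x in (0.5, 1.0, 2.0)]
```

The residuals are zero up to quadrature error, which agrees with the direct check f(x) + ∫₀ˣ f(ξ) dξ = (1 − e^{−x}) + (x − 1 + e^{−x}) = x.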
Starting from rest, a particle slides down
a frictionless curve under gravity (see Fig.
7). It is required to determine the shape of
the curve so that the time of descent will be independent of the starting point.
A curve of this sort is called a tautochrone. As we shall see presently, the only
tautochrones are cycloids.¹
¹ For another interesting property of the cycloid see Chap. 3, Sec. 14, Prob. 3.
If the particle starts at a height y, its velocity v when the height is η can be
found by equating potential and kinetic energies. The result is
$$\tfrac{1}{2}mv^2 = mg(y - \eta) \quad\text{or}\quad v = (2g)^{1/2}(y - \eta)^{1/2}$$ (6-2)
where g is the acceleration of gravity. Denoting the arc along the curve by s, we see
that the time for descent is
$$t = \int \frac{ds}{v} = \int_0^y \frac{f(\eta)}{v}\, d\eta$$
where f(η) stands for ds/dy at y = η. Since the time is constant and since v is given
by (6-2), the problem reduces to
$$\int_0^y f(\eta)(y - \eta)^{-1/2}\, d\eta = c_0$$
where $c_0$ is constant. This is an integral equation for f.
Taking the Laplace transform gives $L(f)\,L(y^{-1/2}) = L(c_0)$, or
$$L(f)\,\sqrt{\pi}\, p^{-1/2} = c_0 p^{-1}.$$
This gives $L(f) = c_1 p^{-1/2}$, where $c_1$ is constant, and hence $f(y) = cy^{-1/2}$, where c is
constant. Thus we are led to the differential equation
$$\frac{ds}{dy} = \left[1 + \left(\frac{dx}{dy}\right)^2\right]^{1/2} = cy^{-1/2}.$$
If we set $y = c^2 \sin^2(\phi/2)$, a short calculation yields
$$x = \frac{c^2}{2}(\phi + \sin\phi), \qquad y = \frac{c^2}{2}(1 - \cos\phi)$$
which are the parametric equations of a cycloid.
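The constancy of the descent time can be examined numerically. The sketch below is an illustration, not the text's derivation: it evaluates T(y) = ∫₀^y f(η) [2g(y − η)]^{−1/2} dη after the assumed substitution η = y sin²θ, which removes the endpoint singularities, and compares the cycloid profile f(η) = cη^{−1/2} with a straight incline for which ds/dy is constant. The value g = 9.81 and the function names are assumptions made for the example.

```python
import math

def descent_time(f, y, g=9.81, steps=4000):
    """T(y) = integral_0^y f(eta) / sqrt(2 g (y - eta)) d eta.

    With eta = y sin^2(theta), d eta / sqrt(y - eta) = 2 sqrt(y) sin(theta) d theta,
    so the integrand is smooth on 0 < theta < pi/2.
    """
    dtheta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta  # midpoint rule; never hits theta = 0
        eta = y * math.sin(theta) ** 2
        total += f(eta) * 2.0 * math.sqrt(y) * math.sin(theta)
    return total * dtheta / math.sqrt(2.0 * g)

c = 1.0
cycloid = lambda eta: c / math.sqrt(eta)   # f(eta) = c * eta^(-1/2)
incline = lambda eta: 1.0                  # constant ds/dy (straight line)

cycloid_times = [descent_time(cycloid, y) for y in (0.1, 0.5, 1.0)]
incline_times = [descent_time(incline, y) for y in (0.1, 0.5, 1.0)]
```

For the cycloid profile the three times agree with cπ/(2g)^{1/2}, while for the incline the time grows like the square root of the starting height.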
REVIEW PROBLEMS
1. The current I in an RL circuit satisfies
$$L\frac{dI}{dt} + RI = V$$
where V = V(t) is the applied voltage. At time t = 0 a switch is closed, so that V
suddenly assumes the value $V_0 + A \sin \omega t$. (Here L, R, $V_0$, A, and ω are constants.)
By use of the Laplace transform find I for t > 0.
2. Find the response of the circuit in Prob. 1 to a unit impulse at time t = 0, assuming
that V = 0 for t < 0.
3. Find the steady-state solution in Prob. 1 when V is an arbitrary function by (a)
the Heaviside expansion theorem, (b) the superposition principle.
4. If L(y) = F(p) use Table 1, entry 4b, to obtain
$$L(xy) = -F', \qquad L(xy') = -(pF)', \qquad L(xy'') = -(p^2F)' + y(0).$$
5. A function y satisfies xy'' + y' + xy = 0 and has a Laplace transform L(y) = F(p).
By use of Prob. 4 show that
$$F'(1 + p^2) = -pF,$$
and thus deduce that $y = cJ_0(x)$ where c is constant.
6. An insulated rod extending along the positive x axis is initially at temperature 0,
and the end x = 0 has the temperature f(t) at time t. The temperature u(x,t) satisfies
$$u_t = a^2 u_{xx}, \qquad u(x,0) = 0, \qquad u(0,t) = f(t).$$
(a) If U(x,p) is the transform of u with respect to t, show that
$$U = L(f)\, e^{-x\sqrt{p}/a}.$$
[Assume that U → 0 as x → ∞ and note that U(0,p) = L(f).]
(b) By writing U = L(f)L(g), where g is found from Table 2, deduce that
$$u(x,t) = \frac{x}{2a\sqrt{\pi}} \int_0^t \frac{e^{-x^2/[4a^2(t - \tau)]}}{(t - \tau)^{3/2}}\, f(\tau)\, d\tau.$$
7. Use the Laplace transform to solve some of the text examples and problems in
Chap. 1, Secs. 21 to 26.
Table 1. Properties of L[f(x)] = F(p)

     a                                    b
1    L(f + g) = L(f) + L(g)               L(cf) = cL(f)
2    $L[f(x - c)] = e^{-pc}F(p)$          $F(p - c) = L[e^{cx}f(x)]$
3    $L[f(cx)] = (1/c)\,F(p/c)$
4    L[f'(x)] = pF(p) − f(0)              F'(p) = L[−xf(x)]
5
1 [ r /(< > ^
6    L(f)L(g) = L(h) where $h(x) = \int_0^x f(\xi)\, g(x - \xi)\, d\xi$
APPENDIX C
COMPARISON OF THE RIEMANN AND
LEBESGUE INTEGRALS
1. The Riemann Integral. Let a function f(x) be given on the interval
a ≤ x ≤ b (Fig. 1). To define the Riemann integral
$$\int_a^b f(x)\, dx$$ (1-1)
we divide the interval [a,b] into smaller intervals by points $x_k$,
$$a = x_0 < x_1 < x_2 < \cdots < x_n = b.$$
It will be desirable to consider a sequence of subdivisions which are made
finer and finer by choosing more and more points $x_k$. The precise
requirement is
$$n \to \infty \quad\text{and}\quad \max_k |x_k - x_{k-1}| \to 0.$$
To describe this situation we say, in brief, that the subdivision becomes
arbitrarily fine.
Let $\xi_k$ be an arbitrary point on the interval $x_{k-1} \le x \le x_k$. With $y_k = f(\xi_k)$
as shown in Fig. 1, the sum
$$s = y_1(x_1 - x_0) + y_2(x_2 - x_1) + \cdots + y_n(x_n - x_{n-1})$$ (1-2)
represents a certain area that presumably approximates the area under the
curve y = f(x). The geometric interpretation suggests that s has a unique
limit $s_0$ independent of the manner of subdivision, provided the
subdivision becomes arbitrarily fine. When s
actually does have this behavior, f(x) is
said to be Riemann integrable, and the
limit $s_0$ is called the Riemann integral.
The Riemann integral does not exist if f(x)
oscillates too violently. For example, let a = 0,
b = 1 and define
$$f(x) = 2 \ \text{for } x \text{ rational,}^1 \qquad f(x) = 3 \ \text{for } x \text{ irrational.}$$ (1-3)
It is easily shown that every interval (no matter
how small) contains both rational and irrational
numbers, so that the graph of f(x) has the appearance
suggested by Fig. 2. If we choose $\xi_k$ rational,
then $f(\xi_k) = 2$ and
$$s = 2(x_1 - x_0) + 2(x_2 - x_1) + \cdots + 2(x_n - x_{n-1}) = 2(x_n - x_0) = 2$$
no matter how fine the subdivision may be.
On the other hand, if the $\xi_k$'s are all irrational,
then s = 3. This shows that the limit of s depends
on the manner of subdivision and, hence, that the
Riemann integral does not exist. As we shall see
presently, the Lebesgue integral for this function
does exist and can be evaluated explicitly.
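The dependence of s on the choice of the points $\xi_k$ can be imitated in a short computation. This is only a sketch under an obvious limitation: floating-point numbers cannot distinguish rational from irrational points, so each tag is represented here by a flag saying which kind of point was chosen in that subinterval. The names and the uniform subdivision are illustrative assumptions.

```python
from fractions import Fraction

def f(tag_is_rational):
    """The function (1-3): 2 at rational points, 3 at irrational points."""
    return 2 if tag_is_rational else 3

def riemann_sum(points, tags_rational):
    """s = sum of f(xi_k) (x_k - x_{k-1}); tags_rational[k] says whether
    the point xi_k chosen in the k-th subinterval is rational."""
    return sum(
        f(is_rat) * (right - left)
        for left, right, is_rat in zip(points, points[1:], tags_rational)
    )

n = 1000
points = [Fraction(k, n) for k in range(n + 1)]   # subdivision of [0, 1]
s_rational = riemann_sum(points, [True] * n)      # every xi_k rational
s_irrational = riemann_sum(points, [False] * n)   # every xi_k irrational
```

However large n is taken, the first sum is 2 and the second is 3, so no single limit exists.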
2. Measure. The decisive idea in the Lebesgue integral is the notion
of measure, which will now be described. The measure of an open² interval
a < x < b is simply the length b − a. If a set consists of a finite collection
of such intervals (Fig. 3), the measure is the sum of the lengths. The
same definition applies when there are infinitely many intervals. The
sum of the lengths is now an infinite series, but since the terms are positive,
the sum does not depend on the order of the terms (Chap. 2, Sec. 6, Theorem
III). Thus, the measure is well defined in this case also.
¹ A rational number is a fraction p/q where p and q are integers; √2, for example, is not rational.
² An interval is open if the end points do not belong to the interval and closed if they
do. Thus a ≤ x ≤ b is a closed interval.
The notion of measure can be extended to still more general sets E
as follows. Let I be a collection of open intervals which contains¹ E,
and let m(I) denote the measure of I. We approximate E better and better
by these sets I, so that m(I) becomes smaller and smaller. The smallest
value for m(I) which is given by this process is called the outer measure
of E and is denoted by $m_o(E)$.
Strictly speaking, the "smallest value" need not be attained, and the precise definition
of outer measure is as follows: The outer measure is the largest number c such that
m(I) ≥ c for all sets I of the above-described type. The number c is called the greatest
lower bound of the numbers m(I); its existence can be established by the fundamental
principle quoted in Chap. 2, Sec. 1.
A collection of open intervals, such as I in the foregoing discussion, is
called an open set. As we have seen, outer measure is defined by considering
the open sets containing E. The points of [a,b] not belonging to a given
open set form a closed set. By considering closed sets contained in E one
can define the inner measure $m_i(E)$. If $m_i(E) = m_o(E)$, the set E is said
to be measurable and the common value is called the measure of E.
To illustrate the calculation of a measure, let the set E consist of the rational points
x on 0 ≤ x ≤ 1, that is, the points whose coordinate x is a rational number. By taking
first the rational numbers p/q with denominator q = 1, then those with q = 2, and
so on, we see that the rational numbers can be arranged in a sequence
$$r_1, r_2, r_3, \ldots, r_n, \ldots.$$ (2-1)
Given ε > 0, construct an open interval of length ε/2 centered at $r_1$, an interval of
length $\varepsilon/2^2$ centered at $r_2$, and so on. The nth interval is of length $\varepsilon/2^n$ and is centered
at $r_n$. If I denotes the set consisting of all these open intervals, then
$$m(I) \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2^2} + \cdots + \frac{\varepsilon}{2^n} + \cdots = \varepsilon.$$ (2-2)
[We have inequality rather than equality in (2-2) because some of the intervals may
overlap.]
The foregoing construction shows that the outer measure of E is ≤ ε. Since ε is
arbitrary, the outer measure must be zero. Because $m_i(E) \le m_o(E)$, it follows that the
inner measure is also zero and, hence, m(E) = 0.
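The covering construction can be carried out for finitely many terms of the sequence. The following sketch is an illustration with assumed names: it lists the first N distinct rationals in [0,1] by increasing denominator, centers an interval of length ε/2ⁿ at the nth one, and compares the sum of the lengths with the measure of the union after overlapping intervals are merged. Exact rational arithmetic is used so that the comparison with ε is not blurred by rounding.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], ordered by denominator q."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def union_measure(intervals):
    """Total length of a finite union of intervals (overlaps counted once)."""
    total = Fraction(0)
    cur_lo = cur_hi = None
    for lo, hi in sorted(intervals):
        if cur_hi is None or lo > cur_hi:
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total

def cover(count, eps):
    """Intervals of length eps / 2**n centered at r_n, n = 1, ..., count."""
    centers = rationals_in_unit_interval(count)
    return [
        (r - eps / 2 ** (n + 1), r + eps / 2 ** (n + 1))
        for n, r in enumerate(centers, start=1)
    ]

eps = Fraction(1, 10)
intervals = cover(50, eps)
sum_of_lengths = sum(hi - lo for lo, hi in intervals)
measure_of_union = union_measure(intervals)
```

The sum of the first N lengths is ε(1 − 2^{−N}), which stays below ε for every N, and the measure of the union cannot exceed that sum.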
As a second illustration we shall find m(E'), where E' is the set of all irrational numbers
on [0,1]. One of the most important properties of measure is that it is additive; if E
and E' are two measurable sets with no point in common, then
$$m(E + E') = m(E) + m(E').$$
(We use E + E' as an abbreviation for the set of all the points belonging either to E
or to E'.) In the present case E is the set of rational points on [0,1], and E' the set of
irrational points on [0,1]. Evidently E + E' is the set of all points on [0,1], so that
m(E + E') = 1. The above equation then gives
$$m(E') = 1 - m(E) = 1 - 0 = 1.$$
¹ That is, every point of E is interior to one of the intervals belonging to the set I.
3. The Lebesgue Integral. A function y = f(x) is said to be measurable
if the set of points x at which f(x) < c is measurable for any and all choices
of the constant c. It can be shown that the set $e_k$ at which $y_{k-1} \le f(x) < y_k$
is then measurable for all choices of $y_{k-1}$ and $y_k$. To define the Lebesgue
integral of f(x), let the y axis be subdivided by points $y_k$ as shown in Fig. 4,
and form the sum
$$\sigma = y_1 m(e_1) + y_2 m(e_2) + \cdots + y_n m(e_n).$$
When f(x) is measurable and bounded, the sum σ has a unique limit $\sigma_0$,
independent of the manner of subdivision, provided the subdivision
becomes arbitrarily fine. This limit $\sigma_0$ is called the Lebesgue integral of f(x)
and is written in the form (1-1).
The most obvious difference between Riemann's definition and
Lebesgue's is that in the former the x axis and in the latter the y axis is
subdivided. This distinction, however, is superficial. The important fact is
that Riemann's definition is based on the notion length of an interval
whereas Lebesgue's is based on the more general notion, measure of a set.
The intervals $[x_{k-1}, x_k]$ in Riemann's definition play the same role as the
sets $e_k$ in Lebesgue's.
Riemann's definition breaks down if f(x) does not remain close to $y_k$
throughout most of the intervals $[x_{k-1}, x_k]$. Lebesgue's definition cannot
break down in this way, because f(x) is automatically close to $y_k$ throughout
the set $e_k$. That is why (in contrast to the former definition) the
latter carries with it an assertion that the integral actually exists.
To illustrate the calculation of a Lebesgue integral we shall integrate the function
(1-3) illustrated in Fig. 2. If the intervals $(y_{p-1}, y_p)$ and $(y_{q-1}, y_q)$ contain 2 and 3,
respectively (Fig. 5), then the sets $e_p$ and $e_q$ are the only ones that are not empty. Thus
$m(e_k) = 0$ for k ≠ p or q, and the sum reduces to
$$\sigma = y_p m(e_p) + y_q m(e_q).$$
Since $e_p$ is the set of rational points and $e_q$ the set of irrational points, these sets have
the measures 0 and 1, respectively. Hence $\sigma = y_q$. As the subdivision becomes
arbitrarily fine, $y_q \to 3$ and the Lebesgue integral is found to be
$$\int_0^1 f(x)\, dx = 3.$$ (3-1)
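The Lebesgue sum for this example can be formed directly once the measures of the level sets are known. The sketch below assumes, as an illustration, that the function is described by a list of (value, measure) pairs, here (2, 0) for the rational points and (3, 1) for the irrational points, and that $e_k$ collects the values with $y_{k-1} \le f(x) < y_k$. The names and the range [1.5, 3.5] chosen for the subdivision of the y axis are hypothetical.

```python
def lebesgue_sum(level_sets, y_points):
    """sigma = sum of y_k * m(e_k), where e_k = {x : y_{k-1} <= f(x) < y_k}.

    level_sets: (value, measure) pairs for a function taking finitely many values.
    y_points:   increasing subdivision y_0 < y_1 < ... < y_n of the y axis.
    """
    sigma = 0.0
    for y_prev, y_k in zip(y_points, y_points[1:]):
        m_ek = sum(m for value, m in level_sets if y_prev <= value < y_k)
        sigma += y_k * m_ek
    return sigma

# Function (1-3): value 2 on a set of measure 0, value 3 on a set of measure 1.
level_sets = [(2.0, 0.0), (3.0, 1.0)]

def uniform_subdivision(n, lo=1.5, hi=3.5):
    return [lo + (hi - lo) * i / n for i in range(n + 1)]

sums = {n: lebesgue_sum(level_sets, uniform_subdivision(n)) for n in (10, 100, 1000)}
```

Each sum equals the first subdivision point above 3, so it differs from 3 by at most the mesh 2/n and tends to 3 as the subdivision is refined.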
It can be shown that if the Riemann
integral exists, then the Lebesgue integral
exists also and the two have the same
value. On the other hand, the latter may
exist when the former does not, as we
have just seen. Because of its greater
generality the Lebesgue integral has
many desirable properties, of which we
mention the following:
Lebesgue Theorem on Bounded Convergence.
Suppose $|f_n(x)| \le M$ where
M is constant, suppose $f_n(x)$ are Lebesgue
integrable, and suppose $\lim f_n(x) = f(x)$ on
an interval [a,b]. Then f(x) is Lebesgue
integrable, and
$$\lim \int_a^b f_n(x)\, dx = \int_a^b f(x)\, dx.$$
To see why the theorem fails for Riemann
integrals, let $f_n(x) = 2$ at the first n rational
points $r_k$ in the sequence (2-1) and $f_n(x) = 3$
elsewhere. Then $|f_n(x)| \le 3$, and as a Riemann
or Lebesgue integral,
$$\int_0^1 f_n(x)\, dx = 3.$$ (3-2)
Evidently, $\lim f_n(x) = f(x)$, where f(x) is the function (1-3). Taking the limit of the
expression (3-2) as n → ∞, we get
$$\lim \int_0^1 f_n(x)\, dx = 3 = \int_0^1 f(x)\, dx$$ (3-3)
provided the latter integral is the Lebesgue integral (3-1). Equation (3-3) does not
hold for Riemann integration because, as we have seen, f(x) is not Riemann integrable.
APPENDIX D
TABLE OF *(*) * ~= f* it '
x 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
* This table is reproduced by permission from the "Biometrika Tables for
Statisticians," vol. 1, 1954, edited by E. S. Pearson and H. O. Hartley and published by the
Cambridge University Press for the Biometrika Trustees.
ANSWERS
Section 1, Pages 10-11
1. Ordinary, fourth order.
4. Ordinary, first order.
7. Ordinary, second order.
14. y *» x 2 ;y « x 2 ; y « x 2
x 8 x 8
15. y _ -g- -f x;y - — + t
CHAPTER 1
2. Partial, fourth order,
5, Ordinary, second order.
8. Ordinary, third order,
-f 1 ; y « x 2 — 2.
JlX + C2.
3. Ordinary, first order.
6. Partial, second order.
Section 3, Pages 13-14
1 . 3 ^.
3. p *» ~/c; a *
6. v * 30e“*‘; »
7. 2 in.
- k).
- J ( 1 -•-*>.
4. (log 2)/2.
6. 2 32 hr; qo.
Section 4, Page 16
2. The rate g would be thought of as/(0<7 instead, where /(<) « 0 for 0 < t < and
/(f) « 1 for f > fo* Equation (4-5) would be written dx/dt » wr *-f- /(f)<? — rx/p.
3. dx/dt * wr + fcAoe"* 4 - rx/j/. 6. (A - x)/(# - x) - [A/B)^ a ^ b \
6. Let x represent amount of substance dissolved after time f, A the amount of sub-
stance present when x ** 0, t «* 0, and c the proportionality constant. If v is the
volume of solvent and S the saturate concentration, then dx/dt ** c(A — x) X
(S — x/v) if the dissolving substance docs not change the volume v of the solvent.
Section 6, Page 18
1. sin”- 1 x — sin™ 1 y *» c.
3. 2 cos jy — sin x cos x -f- x «** c.
6. (tan™ 1 y — 2VT -f x) I * 0.
1(0,1)
Section 7, Page 20
‘■[K ,+ ?)]’
3. sin - *f log x
x
c
x
2. ( y — J)/(y + 1) 588 cc T ®.
4. (sec x — tan y)| » 0.
*( 0 . 1 )
6. ( y + l)/(x + 1) - 2.
2. sin*” 1 - — log x ** c.
x
4. i* - 2* V - y 4 - -2.
*• logy 4 -
6. y - x 4* y log x - 0.
7. 2 tan”* 1 (e*0 4* log tanh ;
*
9. * - cs*^.
11. log x 4 «~ v/ * « c.
Section 8, Page 22
1. a* 4 * 4 y ■ c.
4. xy « c.
7. Not exact.
10. Not exact.
Section 9, Page 28
7. y/z 4 x — c.
8. y - ee~ 2Vxlv .
10. — — ~ — log y « c.
x V
12. y(2 - log y) = ^ tan 2 x 4 c.
2. Not exact.
5. sin (y/x) *» c.
8. x 2 4 sin xy »
8 . x — eye 11
10. y — 2 tan - " 1 (x/y) « c. 11. (x/y)e v
Section 10, Page 26
1. 1 + Vac* + 1 - cxe-*' 1 ' 3 *'.
a. y - e~**(x - 1).
5. y « cos 2 x 4 2(sin x — 1).
8. y ■» sin x 4 ce*.
10. x - 1 4 ce"^ 2 .
12. y -
12. x sin"" 1 x 4 V7 - x 2
Section 11, Page 27
1. y * cie 8x ; y * cyT*.
p 4 — 2
S. y ~ — ; x - 2p
p
3. x 8 y — xy 8 * c.
6. x 2 4 y 2 ** c.
9. x 2 y 4 xy 2 4 x
9. yx* - ce v .
12. y 4 x 2 /y * c.
o
• V “ 3(x* + 1)'
4. y - 1 - 2e“*'*
2 ,
6. y » 2 sin x ~ x cos x 4 - cos x 4 *
9. x - ce w,) ' / * = *.
11. x(l + 4y 2 )* - c.
- - ]) - o.
2. y - x + cj; y - x +
4. V - log (p* + 2p); x - - 1 tan 1 ^ + c.
(^X 4 1 f~ .
6. y «* ; y *» 2 Vx. 6. sin“ A y d= x « c.
c
7. y * cs x ; y « c — x 2 . 8. y - e* 4 c; y ■* c — x*/2.
Section 12, Pages 28-29
1* y 4 ** (48x“~ 2 — 96x“* 4 — 4) coe x 4 (16x _1 — 96x~ 3 ) sin x 4 cx” 4 .
2.f 2 -*4«4 c#\ 8. y“ B - M* 3 4 cx*.
4. x «■ y log cx. 6. y~ l ** 1 4 log x 4 cx.
6. y~ 3 — 1 + x* + ce* 1 . “ + ” -
8 . 3
ax
7 » — — ;i-u~3,y-p4i
aw 2w 4 tf
■ ; u * 3x 4 y 4 7.
. u — 0
> srn — ■ — ; x
w 4 »
u — y ** v —
2 2
10 .
du
— I m cos u; u W x 4- y.
1 y a 4- cy.
dx
12. 2x
14. x see y ®» log | sec y 4* tan y | 4- c.
11. x~ 2 + «**
16, tan"* 1 y — tan“
19. - c - x.
23. y - e 8x + ce 2x .
x ** c. 17. 4* y 4
13. y * log (ary - c).
16. y tan" 1
** c.
20 .
24.
- y 4- H 4- cc 2v *
2x 2 c w + cx 2 .
x » c.
18. xsin2y
21. 4x « 2y - 1 4* cc~ 2v .
Section 14, Page 33
1. y » cx, 2. at 2 — y 2 »» c.
8. x 2 4- ny 2 « c. 6. 0 * c.
9. Self-orthogonal family.
10 . x 2 — 2ax 4- y 2 ** 0; 2xyy' 4* x 2 — y 2 ** 0; a family of orthogonal curves is
x 2 - 2 ay 4* y 2 * 0.
Section 16, Page 34
1. (a) y * cx; (6) x 2 4- y 7 «* r 2 .
2. (a) o ± x - VT- y* - 1«K 1(1 + VT - j 7-)/y}; ( 6 ) y - re'.
3. (a) y ** c x ; (b) y = cosh x.
4. - c(i + Vx^+'^r 6. t - |x 0 |6/(fc 2 - a 2 ); hi -To 1 - «.
Section 19, Page 46
1. t » 100 /y sec.
6. tan 6 ** tan 0 O 4* 2eE/{(*>mvc).
Section 20, Pages 49-60
2. v » V2y/i, s « sin 0.
4. t> - Vj(l - - 1
6. y
2vq cos 2 a
x 2 4- x tan a.
e 2k»g/v> *
Section 21, Page 64
2. y » — e x 4- e 2x ; y » 0. 4. y «* — 2x 2 4- 4x.
6. y °» e 2x 4“ 2xe 2x ; y « 0.
Section 22, Page 66
1. y ® cie"" 9 ® 4* C2€ 6x .
4. y * cie 2 * 4" C 2 e~~ 2x .
6. y «* cic 2x 4* cjxe 2 *.
2. y ** Cje 3i 4- C 2 C 2x . 3. y » cie* -J- cjxc*.
6. y * rj cos 2x 4~ C 2 sin 2x.
7. y ® cic 2r cv)8 x 4* C 2 « 2x sin x.
Section 23, Page 68
1. y *® ci«“ x 4- C 2 e x/2 . 2. y
4. y ■* cie ?x 4- c 2 xe Sx 6. y
Section 24, Page 63
1. y m d« 3x 4- C& 2x 4- V 4 *-
3. y °« cje“" 8x 4“ c#"" 2 * 4- MV**
6. y « cie* 4* C 20 "~* — 5x 4- 2.
7. y •» (ci 4“ C 2 x)e x 4- xV/6.
9. y • ci 4- 4* x/3.
rje x 4- ^ 2 C x . 3. y ** cie 2x 4- C 2 e*“*.
cie“ x 4- c 2 X€” x .
2. y « (ci 4- c 2 x)«- x 4* x ~ 2.
4. y *=* (ci 4- csx)^ 4- x -h 2.
6. y - cie x 4- C 2 <r x 4* c 2x (x/3 ~ K).
8. y «* cje^ 4* cjxe 8 * 4- xV x /2.
10. y
11. y
13. y
H. v
16. y
18. y
20. y
22 . y
9x 2 - I8x -f 7
«* ci sm 3x 4 ca cos 3x H
81
« cie x -f cje*"* -f xe x /2. 12. y *
» Ci« 8x 4 esc 2 * 4 c 2x f x 5 — 3x 2 — 6x
/X 3 x 2 \
■» C1«* + Cj £«* +
( 6 ~ 2/
16. y
* 3
13x 2 24x
- Cl + — - -
25 + 125*
17. y
5
- 1 - e~ x + x 2 -
X.
19. 1/
- (3i - 4)/9.
21. y
* 2e“ x - 5<r s 79 4 x/3 - %.
d sin x 4 ca coa x 4 x 8 ~ 5x.
)■
cic 2x 4* cac 8x 4
2x 2 4 6x 4 3
-4e~ x 4 2*~* c 4 2e*.
0.
0.
Section 26, Page 66
1. y « cic x 4 C 2 € 2x — (3 sin 2± 4 cos 2x) /20.
2. y «« ci sin 2x 4 C 2 cos 2x — (cos 3x)/5.
8. y ** cjc 1 4 C 2 «"~ x/2 4 2 sin x.
4. y « ci«" 2x 4 c*r Sx 4 3xe“ 2x 4 c s 730.
8. y « 6“ x (ci sin 2x 4 c 2 cos 2x) 4 e x (>£ 0 sin 2x — 34 0 cos 2x).
6. y « cie 8x 4 — 34« 3x cos 3x
7. y * cie~ 6x 4 c 2 c“ 6x 4 xc Bx /10 - x 2 /25 4 4x/25 - % 2 &.
8. y » ci sin x 4 c 2 cos x 4 ?& cos 3x — sin 2x.
9. y *■ — c®/4 4 e~’*/4 — 34 sin x.
10 . y * 0.
11. V * -34 4 %oe 2x 4 c” x (Ko cos x 34 sin x).
12 . y - — cos x 4 (x/2) sin x 4 1 .
Section 26, Page 70
1. V
2. y
8 . 2 /
4. y
6. y -
6 * 2 / !
7. y-
8* If ■
10. y -
1 ci 4 C2 e~ 3x/2 4 cse 8x .
1 ci sin x 4 c 2 cos x 4 c~" x (c3 cos 2x 4 C4 sin 2x).
C\e~ x 4 c&xT* 4 C3X 2 e~ x .
cie~ 2x 4 czc x sin (\/3 x) 4 cge 1 cos (\/3 x).
(ci 4 C2x)e x 4 c 3 .
(cj 4 c 2 x 4 cgx 2 )er r 4 c 4 .
1 ci cos kx 4 c 2 sin kx 4 C3 cosh kx 4_f4 sinh kx.
■ ci 4 c* /2 ^c 2
t e x/Vi ( -
Vl5, 1 . Vl5
cos — - — x 4 C3 sm x
2 2 >
4 — 4 - 4
■(
ci sm -
2
^ 4 e~ x/N/2
4
V2
2
Q sin x 4 C4 cos
VS
4 2 cos x.
11. y » ci<? x 4 c#? 2 * 4 c 3 xc 2ar - x 2 /4 — x — *34*
12. y « cie""* 4 esc* 4 4H 4 xe r /2.
18. y - 4 17c x /12 4 7e~~72 - e~ 3 74 - 2x* 4 2x - 5.
Section 27, Page 72
1. Dependent.
2. Independent.
4. Dependent.
5. Independent.
6. Dependent.
7. Independent.
8. Independent.
Section 28, Pages 78-78
1. (a) y ■» (a? 3 — v?)/Z — 2x/9 — (6) y « e x /12; (c) j/ « x -f 2; (d) 3/ «• ain
3. y «* — (a^logaO/O. 4. y *» cjc* 4- c?x -f a: 2 4 1.
Section 29, Page 77
8 . v' ~ c/[x 2 ( 1 - x 2 )].
Section 80, Page 79
1 . V “ W' 2 + cxx" 1 ± % log x -
2. y *** cix 2 4* C2X <5 " , “ v/21)/2 4* cgx^™^ 21 ^ 2 —
3. 2/ - c^i+vT*)/* + ^(iWTo/2 + *2/3.
4. j/ « cix 2 4- c 2 x — xl(log x) 2 /2 4- log xj.
5. t/ » c x x 2 4" c 2 x 8 . 6. y « c x x n 4- C2£~ n ” 1 .
Section 32, Pages 85-86
1. y ** 2 cos \/T0 *, ; ?/ * 2 cos VT0 < 4- \/i0 sin VTO
2tt
3. ?/ « 10 cos \/245 t.
5
4. j/ » 10e“ w (cos >/2^0/ 4 siri \ r 22i)t)\R « 400^245(^68.
V 220
5. V - ie0\/2 e~ 6W1 cos ^500* -^) ; V’ « lOOf' 500 ^! + 500\/2 0-
6 . K - 20^ puih 10,000-s/o t + \/5 cosh 10,000\/5 ()•
d 2 W
10. 10 + 10 gy * 0; max y « \/3, total drop 2 4 \/3.
Section 36, Pages 99-100
1. y *■ Ci cos t 4- C‘i sin t; x * ri sin * — c 2 cos
2. /y » Cic* 4- e 2 e~ / ; x » ae* — rtf'" 1 .
3. y » o f (ri 4- C2O; a* *= e jr (r x 4* c 2 /2 4* c 2 /)-
4. y = rie* 4- cgc"* 4- 03 cos 1 4* c 4 sin f; x «* fic* 4“ c 2 e"* — C3 cob £ — C4 sin £.
5. y °* Ci(l 4 \/2 )e y/2t 4 r 20_j~ >/2 )c v/2< ; x = cic^* 4 c 2 e*" v/2 ‘.
6 . y - cje^ 8 4 ^ _4 4- Hi
* - r, + c 2 (-- I . 4 17 ) fW-VTi)! +
9. Cycloid of radius mE/(eH 2 ).
Section 36, Page 106
1. yi » c x e~ x 4- c 2 e 8r ; 2/2 * 2(r 2 e 8 * - cie~*).
CHAPTER 2
Section 2, Page 118
3. 1(c), 1(c) for x « 0, =fc?r, d=2*-, . . . ; none in Prob. 2.
4. 1(6), 1(c) for x 9 * 0, =fcw, db2w, . . . ; 2(a), 2(d).
5. 1(a), 1(d); 2(6), 2(c), 2 (/).' 7. (a) Yes; (6) no.
Section 3, Pages 121 -122
1* div, «““* i» r~~» div.
3 log 2
2. Con, con, div, con, div.
3. (a) c < 0; ( b ) con for c < 0. 4. c m < n < e im .
H < * < %* 1.08190 < s < 1.08267. One term; eight terms.
Section 4, Page 124
1. Div, con, con, div. 2. Div, con, div, con, con. 3. Con, div, con, con.
4. (6) No. For example, «n *■ R» b n «• n, c n «■ 1 — n, d„ « 2 — n.
log (9 c)
3. JV > - r— ♦
log 10
Section 5, Page 127
1. Con, div, con, con for |x| < \/5, con. 2. Con, con for c > 1 only, div.
3. Con, eon, con, div.
Section 6, Page 132
1. Cond con: div, abs con, abs con, abs eon.
2. Abs con: |x| < 1, all x, |x| < 1, |x| > 1, — ^3 < x < 4, x *»0, \x — 2\ < 1, all x.
Cond con: x 1, never, never, x = —1, x » never, x «* —3.
8. 0.95.
Section 7, Pages 137-138
3. Unif con for (a) — °o < x < »; (b) |x| < c < 0 1; (c) | (2/n)x — n| > c > 0,
wher& n is the odd integer nearest to (2/w)x; (d) 1 < c < | x ] < 00.
4. Unif con for (a) — « < x < (b) \x\ < c < 0.1; (c) 2 x/r ** odd integer or
{ (2/r)x — nj > c > 0, where n is the odd integer nearest to (2/r)x;
(d) |x| > c > 1 or |x| < c < 1.
6. Yes, no, no, yes. 6. No.
Section 8, Pages 142-143
1. Con for — 1 < x < 1, — \/2 < x < y/2, all x, — 3 < x < 3, — \/3 < x < v^3.
5. (b) tan-' x - £ 4. (rf) 2.72, 0.368.
S. ( b ) tan -1 r •> 2 ~ 2 2n ~ fl .
6. S(-l) n — — , E(-l)"- 1 —
4ft -f-1 o (it 4“ 3) (5n. 4- b)
E jl±i x8 .
Section 9, Pages 146-147
, 2"(x - 1)"
x 2n+1 7r 2n+1 fx — l') 2n + 1
— — r 2n + l =* ~ ii •
WZ{ 1} (2n + l)! ( J (2n + 1)! ’
Jin (x l) 2n
<’> ~ : ■«-'>■ «si + “ 1 £ <-»- sttwi - ’-“'ir
ft)2 + *’ -3+ 2ft- 1) + ft - l) 1 ;
«£*=!?■
i. ^jS-" ; w s(-«-
ft! ft 4* 1
» (a) 2( — l) n (x - 1)*; (b) - jZ (i + ~^r) (* - 1)»;
W-gZ(l+2(-l)*-^) (*-«».
*■ (' - if + 5 ^ ranTijr (* - ;)*
x 2n .~4n+2
6.S-.)* - r ,X ( - 1 )- s _ n5j ,
r 2*+l
2 . V
^(2n)I ^(2n + l)!
2(~D n
n + l‘
Section 10, Page 149
1. (o)S(-l)"
(c)2(— 1)"
3. 2(— 1)’
^n+l
(2n *f l)n!
~4n+8
;(b)2j2
(4n + 3) (Sin 4- 1)!
x p+n
for p > 0 and all x.
(2 n -f l)(2n + 1)! '
r 2»41
; (d) 2(— i) B -
(2n + l)(2n + 1)!
n!(p + n)
1 f («? ~ 1)(<? ~ 2)
' J> i n!(p + n)(-l)»
(9 ~ «)
j > 0, p ^ 0, —1, —2,
Section 11, Pages 162-158
2. (a) a: + 4- Ks** 4 — ; (6) 1 + x + M* 1 4 — ;
(c) l 4- 4- M**‘ 4-- • •; (<*) H + H* - Hs** +■■■;
<e)l-X*-Msa? +•••;(/)!
8. (a) 0.00133. 4. 4- «** +• •
6 . 3.004, 0.986, 0.839, 2.036. 6. 0.310, 0.020, -1.025, 0.94.
7. |«| < 0.24 radian - 14°. 9. H** - 34 2 * 7 -
Section 12, Page 166
, 2 ”
1. Z-fX”, 1 +x 4- 2 1.
'n!
3. sin
ni
.-1 -8 — ( 2 » — 1 ) 1
2*4 • • * 2n 2n + 1
2. y - 1 + a: 2 + Hx*', k « 2.
r 2n+l
Section 13, Page 169
X 2n X 2»*U
1. (a) ooS( - l) n 4- 0 iS( - l) n ;
x } ( 2 n) 1 “r®X V ) Qn + iy.'
(b) 1 -f «o cos x + «i sin x;
l-4x 6 , 1-4-7* 9
(c) 4 + h
2. (o) e*; (6) x - 1.
61
9!
+ •
+...)+(
M'
X 2i' 2-5x 7 , 2-5-8 1 *
ni i! + ir + ~7r + ~ior
21 + 61 + ‘ 8! ‘ 111
8. Ci2(-l) n — 4-e»2* n .
n!
)•
Section 14, Pages 166-166
6. On ** —
1
(n 4- p)n
On-*.
7. (b)J—(ci
> irX
(ci cos x -f c* sin x).
Section 17, Page m
2.
sin (n -f* l)x/2 sin (ruc/2)
4. 12*.
sin (x/2)
X 9* 0, ±2r t db4ir,
Section 18, Pages 182*18$
o «6 /_ iNn-l 0© / _i\» «o
*•-*■ + £ ^ cog (2n - l)z + E ^r~ sin 2nx + E ( '
8' ‘ i 2(2n — 1)
Section 18, Page 187
1* e, e, o, e, o, neither, e.
Section 20, Pages 191*192
e 4 V 8in ( 2n jl jXg/gjf
* TO 2n + 1
4. COS rX.
4n
1
2 T ^ 2 (2n * 1) J
Section 28, Page 204
4 4
2. a x - * ; os ■» 0; a# * —
r 3ir
CHAPTER 8
Section 1, Page 219
(a) Entire xy plane; ( b ) entire xy plane; (c) y 2 < 5;
(d) ^ + y 8 ^ 0; (e) i ^ 0; (/) (x - l) 2 + y 1 < 1*
Section 2, Page 222
t <«) \ :(&) ** - rdbr.’ l! +
X* X ' ' ' X 2 + y 2 '
(c) y cos xy -f X, x cos xy; ( d ) e* log y, e*/y;
1
(e) 2xy -f
vr
2* (a) 2xy — **, x 2 + i y y - 2x*; (6) yz + -» xz *f -> xy;
x y
t —zx
(c)
»
(•)
*> sin"
Vy 2 — x 2> yV y 2 — x 2 * y
x y t
Vx 2 + y 2 + « 2 ’ Vx 2 + y 2 + 2 2 V^x^+'y* +'«* *
— x -y -2
(x 2 4* y 2 *f e 2 )* 2 ’ (x 2 + y 2 -f 2 2 )^' (x 2 4- y 2 -f « 2 )^*
Section 8, Pages 227*228
1. ir/6 ft 8 .
4. 2,250.
7. 0.112; 0.054.
10. l.Oir; r.
l)«*i ~ ain nx.
2n
cos (2n — l)irx.
2. 11.7 ft.
5. 10.85
8. 53.78; 0.93.
8. 0.139 ft.
0. 88.64.
9. 0.003r; 0.3 per cent.
1
Section 4, Page 280
1. aa*/a 2 4 yyo/h 2 - 1.
i ay + V2 a6.
8. (o) «<* (2t an ^ ooe
(6) 2r(l — 3 tan 2 0), —6 r 2 tan 5 sec* 0.
6. (a) 2x, 2(x 4- tan x sec 2 x);
p (
Section B, Page 2S6
sec y -f &cV
2. *(«§ — ajfo) -j- y(y8 — aate)
/JA M *V, . dV f dV . BV\ dV
(b) cos 0 h sin 0 — » r cos 0 sin 0 — ) > — •
dx dy \ dy dx / dz
1. (a) y' - -
<»*
2
x sec y tan y 4 2x 8 y *
Sx^ 3x 2 y
cos * — 3* 2 * ^ ** cos z — 3a 2
!. ? i ? ( I v5 i T?g-v 5)iy| 7 (» v ^+?s + *D !
B. du ■■ 2x dx 4 2y dy «• 2r dr.
g*1/ g*V
e. /„ - ~2 ~ — 2 + *»)>/« “ “=— -r
r 4r uHr
r — ux).
Section 8, Page 246
1. (*■ 4 1)/V5
8 . H[S y/s 4 1 4 e(l 4 VS)] * 6.811.
Section 9, Page 249
1. a/3, a/3, a/3. 2. 8a6c/3 y/%. 8. a/3, 6/3, c/3.
4. y/SP/( 2 VS 4 3), (VS 4 1)172(2 V3 4 3), P/( 2^ 4 3).
g. I m h - ~ VoOr^F, d - V6 «.
Section 10, Page 264
8. (a) 106°46', 90°; (6) 164°16', 90.° 6. d/Va* + 6* + c*.
Section 12, Pages 260-261
i. j + (j - ft + («• - m + («• - 1 )kk + k*
+ S*‘ + T*'‘ + ( ,+ i)"* + i‘ , + ' • ■
where & X — „1, k » y — -
2
8. « |l + (* + *) + [ft* + 4ta + fc»] + • • •}> fc - * - 1, it - V - ]
*• 1 + x + (** - y 1 ) + i (i* - 3xy J ) + jj (* 4 - 6*V + y 4 ) + • •
Section 18, Pages 268-264
4 irsin (rar/2) t COS (rnar/2) — 1 A
X * £ +~ ? * “**
axm-
S.«g-l°g2).
8. 2x*.
4. — tan a.
7. air (a* ~ 1)~*
Section 14, Page 869
d
*• g (py') - n - S - o.
Section 18, Paxes 379-277
3 . u*vdudvdw.
3. «a»(r/2 - %).
1.
8. 32o‘/9.
Section 17, Page 881
JL Tt?/2.
4. 8a*.
3. ududv.
6. x(l - e _ °*).
2. 4a*(x/2 - 1).
5. t ~ a cos* (o/2).
CHAPTER 4
Section 2, Page 291
2. A + B + C - 0. 8. A - i^(S + D), B - H(S - D).
8. f°> TaJ ; <b) 2 (jaT ^ TbT )'
Section 2, Page 298
1. (a) 6j, — 5j; (6) A + B - 21 + 3j + 4k; (A + B) + C - 31 + 3j + 3k;
B + C ■» 2i + j; associative law;
(e) 51 + 10j X 15k, — 2i — 4j — 6k, 3i + 6j + 9k, 31 + 6j + 9k;
(<J) 3i + 6j + 9k, 3i + 3j + 3k, 6i + 9j + 12k;
(e) -4.
Section 4, Page 294
1. (a) 10, 2, 8; (6) 6, 4, i + 3j + k, 10; (c) 12;
(' d) cos -1 3/V2I; (e) 4/>/5; (/) s - 4; (g) -i - j + k.
2. (6) x — —20; y «■ 8; * — 1.
Section 6, Pages 296-297
1. (a) — 2i + 3j — 4k, 51 -_4j + 3k, 3i — j — k, 21 + 3j + 3k, 81 - j - k;
(e) 131 + 2j + 2k; (d) y/l 77/2.
Section 6, Pages 298-299
2. (a) 0; (6) * - %; (d) 0, 0.
Section 7, Pages 301-302
1. (a) R'(t) - 21+ 6tj + 3t*k; ( b ) R'(l) - 21 + 6j + 3k;
(c) v - 2i + 6 j + 3k, |v| - 7.
8. (a) v - R'(<) - 1 + j cos t - k sin t; |R'(«)| - -y/5;
(6) s - 2VS.
Section 8, Pages 306-306
1. (a) W - 0; (6) W - -2; (c) W - 4.
2. (o) T - -i + 2j - 3k; (6) T - 2i + 4j + 4k.
8. («)▼■> AsA x B; (6) t •» kC X (A — B).
8. (a) R - XV* + 5j); (6) R(l,2) - H(4i + j).
Section 9, Page 808
1. (a) n - 1 + 2j + 3k; (6) cos" 1 6/V32; (c) 9/VU.
8. (a) i + 2j + 3k; GO * - 1 + <; y - 21; * - 1 + 3/; (e) V5?;
(e) *£1 + j + (/) Ml - Hi + Hk.
8. (a) R - 6i - 2j + (— 4i + 4j - k)t; (6) -41 + 4j - k;
(d) -Ax + 4tf - * + c - 0; (e) R - -31 + k + (-41 + 4j - k )(.
4. (a) & - 31 s + 8; (b) t - 0; D - 2\/2.
8. R - (i + j + 3k)i, (-«, -*).
Section 10, Page 811
1. n « oi -f 6j + ck.
2. (a) ~i + 6j 4- 2k; (c) 2! + j + 3k + (-i 4- 6j + 2k )t - R .
4. 0 « cos" 1 9/VTO^.
6, — 16i 4- 8j 4- 4k.
Section 11, Pages 815-816
1. («) (-1 + j - k)/V3; GO -* + (» + 2) — (* — 2) - 0;
(c) -y/3 + log (1 + )•
2. (o) v « 21i + 2j + 21k; A - 21 + 2k; (6) v - 2V / 21* + 1;
/x V4< s + 2 „ 1 — 21 j + k
w * ” om,2 , "re* ; N “ -rrrr-rr-
2(2 i* + l) 1 ’
8. Let R(l) “* ( 02 I 1 + fljl + ao)i + (& 2 I 2 + bit + &o)j 4" (cal* 4* cil 4* co)k; then wa,
equation of the plane through the plane curve is
(bic 2 - Mi)(* ~ Oo) -f ( 02 C! - aic 2 )(y - 6q) *f (Mi ~ Ms)(s - 4>) - 0.
8. (a) T - (1 4- 2j)/v/5; N - (j - 2i)/V5;
(6) V = i + 2j; A - 2j; (c) F ( - V5; A, - 4/y'S;
(d) s' - Vl +41* ; s" - 41/V 1 + 41*.
8. (a) A n - 2/V5; (6) * - 2/(5\/5 ).
CHAPTER 6
Section 2, Pages 866-867
2. 0n * 1, 022 “ p 2 , fits “ p 2 «in 2 0, 012 " 023 *■ 013 ** 0, where p «* %\\
0 « X 2 ; 0 « X,.
Section 8, Page 872
1. At (1,2,3), Vu * 2i -f 4j 4- 6k; du/dn - 2\/l4;
At (0,1,2), Vu - 2j 4- 4k; du/dn - 2 VS.
2 . (a) ~(ix 4- jy 4- kzKx 2 4- y 2 4- * 2 )“*;
(6) 2(ix 4- jy 4- k*)0c 2 4- y 2 4* * 2 )~*.
8. du/dn * -3/V6.
6. n « HO -2J4- 2k).
8. du/dn «• — 3.
Section 4, Pages 877-878
4. du/dn ~ -7/ VS-
6. H(2i - 2j - k), H(-2* 4- 2j 4- k).
9. dv/ds - 6/ VS.
8
3. (a) Hi W -X; («)
4. Helical path, t 2 /8 — 1; rectilinear path, ir 2 /8 — X.
(a) W m -K (6) Wm-g.
Section 5, Page 382
1. %■ 2. u{x,y) - xy + Hv* - K*?-
4. (o) « - xyz; (6) xyz. 5. 0; u - log r.
6 , 0 ,
8. & +v* - (*? + v? + 4)- H .
Section 7, Page 888
1. (a) 3; (6) 2/r; (c) 0. 2. 0, 2/r. 5. Su, 0.
Section 8, Pages 390-891
2. iro’t. 8. 4ira6c. 4. 4 ra 5 .
Section 9, Pages 395-896
1. (o) v"374; («) i%.
a. (a) ▼
(&) v
»(** + V s ) + j2®v; u - **/3 + iv 1 ;
1 - V* ■ , 2/ ■ 1 y 2 - 1
(!+*)* (1+ x )* J ' “ “ 2 (1+ *)* :
(c) y •> iy cos x 4* j sin x; u — y mn x;
(d) v - »*v(l - z 1 )-* - j(l - z*) H ; « - — v(l - z 1 )*;
(e) v - i(z + 1) + j(v + 1); u - M[(* + l) 1 + (v + l) 1 ].
8. (o) 2ir; (6) 2 t. 4. 6ir. 8. (J r ,
Section 10, Page 899
2. (a) 0; (ft) 0; (c) 0.
Section 11, Page 402
2. —t. 3 . 0. 4. o.
Section 12, Page 406
2* w « j (xy - Hs?) 4- k[*(* + y) - K(* 2 4- y 2 )J.
8. -u — xy 2 4* x 2 * 2 — x.
4. w « j(xy* 2 - Hxh) 4 - k(- + *V).
6. No.
Section 13, Page 408
du . •idu du
1. Vu-r l - + 7 - + k-.
a. v 1 - - — + ■ 1 3 | L_. a * ■ 1 ** , l a*
(u s + t^)u du (it 2 + t?)v dv u 1 + t?dv? li S + B 8 *! 1 + uVa^. 1 '
3. div F = -3p cos θ/r⁴, curl F = 0.
Section 16, Page 414
2. Irrotational. 4. * - z 1 - y*, hyperbolas.
8. Irrotational and solenoidal. 10. — 3z*y; * - z(z J +
CHAPTER 6
Section 1, Pages 429-431
2. (a) [1 + (x - oO 9 ]" 1 , -2(z - ofl[l + (x - erf)*]-*;
(b) 2(1 + a¥)'\ 2(1 + **)->, (P - 2i + 2) -1 + (P + 2i + 2)"*.
3. U m U — U K Uy m 0.
7. (a) fiiv + »!*), /*(» + mtt).
9. (a) F₁(y - ax) + F₂(y + ax); (b) F₁(y - 2x) + F₂(y + x);
(c) F₁(x + iy) + F₂(x - iy); (d) F₁(y + x) + xF₂(y + x).
10. (d) Ft(y - fix) + F 2 (y + x) - y */ 60 + x«/6.
11. (a) ~ + Fi(2y -f x) -f F*t(y — x); (6) ~ + Fi(y ax) -f F 2 (y -f ax);
(c) 4- K 2 I/* + Fiiv - 2s) + F 2 ( 1 / - x).
Section 4, Page 440
2. u(0,ir) *» sin 2ar; t^(0,ir) «** ^ cos 2ax.
a!(1
3. a «* =fc — — H — — - > n «■ 0, dbl, =fc2, ....
15 5
4. Const + (2a)~ 1 e~’ t . 6. ae~*( 2x* - 1).
Section 5, Pages 445-446
1. - 1 , ±21, ±31, ....
Section 6 , Page 449
2 . (b) 6/2.
Section 8, Pages 454-455
1. 0.45 oscillation per sec.
Section 9, Pages 458-459
1. 2.07 × 10⁶ cal/(m²)(day).
4. (c) ^4 1 1 ~ e“ <a " /i),< ].
3 . («)®E <-
* n-0
1 )"
(2n + 1)*
^ (2n 4* 1 )tx
l
Section 10, Pages 462-463
4. (6) u(x,t) — 2c n e“ Ia(2n ~ 1)/2QS *sin7 (2n
l
l)x;
2 r l
c« « T / fix) sin - (2 n — l)x dx.
t Jo l
;/>>«
/:
_ „ nrat . nrx
7. Za n cos — sm ~J~ * °n ■
a vi « nir <^ , nrx
8. 2*6» sm — sm ; 6 W «
l l
Section 11, Pages 466-467
3. 0.44883, 0.14922, 0.00004.
r , x 400
6. u(z,y) m — 2J
nrx ,
sin -y~ ax.
2
nra
0(x) sin dx.
— i— - e -o«-i)»*/io gin (2n _ d If.
2n - 1 ' to
6. u(x,t) “ ^ 2n ~ — j sin (2n — 1)^2-
7. 36.5, 41.9.
Section 12, Page 471
n
^ 2t Jo r 2 — 2r# cos (0 — <*>) -f # 2
i2 2 - r*
i* 2 -r 2
o L# 2 — 2#rcos(0 — «*>) 4* r 2 R* — 2i2r cos (0 + <*>) -f r 2
2t r 2 - i? 2
Md*.
]/(♦)<*.
Section 13, Page 474
200 ,
1 . U - 50 + — ,
sin (2n — 1)0.
d<f>.
(2 n - l)** 2 ”- 1
60 f T a 2 - r 2
2 . U ** — /
sr Jo a 2 ~ 2ar cos (0 — <£) -f- r 2
*• U “ V Z (2n~l)a^ r2 " -1 8m (2n - 1)9 '
4. u(x,y ) * 2an sin r?u sinh iray; a n =* -7*“ — / /(s) sin *nx dx.
smh %n Jo
Section 14, Page 479
l\ *
Section 16, Page 482
1. £ Jo(frnO COS m ^ Z(A mn COS « mn * + Bmn Hm O&n - 7*
2. Σ A_n e^(-a²k_n²t) J₀(k_n r), where 1 = Σ A_n J₀(k_n r).
Section 18, Page 490
a. («A)-* (W— > cosh <a.
Jo 2ort
S. (iraH)~ H f fU)e- ixUP>Kia, ‘ ) ainh~dt
Jo 2ort
4. (4«A)~ H jf /(*) j £ [e~ (l_2 ’ li_,),/<4a ’ 1) ± e -(*-l» , +.)’/(4o s i)]| dj _
Section 19, Page 493
3. u(*,j/,0 - (4* ^at)- 1 f f e -[(*-» 1 ) , +(v-n) , l/U« , iy( ;Cl)l/l ) ^ dyi
j — <30 J-~<X>
4. u(x, »,*,() = ni7-Tii5
fivuh)
6.»(x.„,0 -fj_" W(t _ (i)
«(*,<)- fjI" Wr mH dti.
</o [4ira 2 (t - ti)J^
Section 24, Page 507
1. |k| < 2, = 2, > 2.
2. Respectively on, in, or outside the unit circle x² + y² = 1.
5. (a) [ce^y + (1 - c)e^(-y)] sin x.
Section 26, Page 512
2, At the lattice points in the region y > x t x > Q; y > h — & f z <i 0.
S.
Section 27, Page 514
1. 19.
2 . 2 .
S. 2.
Section 28, Page 517
12
' RL
1. 7(x,Q -
MCUA).
-(llROinrfiyU *** ,
<50 ir n-»l n 100
». / - 0.6 + 1.1 £ (-1)" COB /0 ‘.
»-i 1,000
Section 29, Page 519
3. t/„ - 0; Urr - 0; l/rr + ^ - 0.
CHAPTER 7
Section 1, Page 533
1. (a) 2, π/3; (b) 2√2, π/4; (c) 2, π; (d) 1, 3π/2;
(e) √2, 7π/4; (f) 1, π/2; (g) fc, π/3; (h) 4,
2. (a) -8; (b) -1 + i; (c) (2 - √3)/2 - i(2 + √3)/2.
3. (a) 1; (b) 1; (c) 1.
4. cos (π/6) + i sin (π/6), cos (π/6 + 2π/3) + i sin (π/6 + 2π/3),
cos (π/6 + 4π/3) + i sin (π/6 + 4π/3).
7. (a) 1, ½(-1 + i√3), ½(-1 - i√3); (b) 1, i, -i, -1.
16. (a) Circle x² + y² = 1; (b) circular region x² + y² < 1;
(c) region exterior to the circle x² + y² = 1 including the boundary.
18. (a) Circle radius 2, center at (1,0); (b) circle radius 1/√const, center at (0,0).
Section 2, Page 535
1. (a) (x² - y² - x + 1) + i(2xy - y); (b) x/(x² + y²) - iy/(x² + y²);
(d) (x 2 -f y 2 - i)/[ar» 4~ (y 4- 1) 2 ] - ^/[x 2 4- (y 4- l) 2 ];
(/) x + t'2y; (g) (x 2 4- y 2 )“ l .
2. (a) Open region x < 3, -∞ < y < ∞;
(b) The region y > 1, -∞ < x < ∞;
(c) The region exterior to the circle of radius 1 with center at the origin and in-
cluding circular boundary;
(d) Circular ring centered at the origin with interior radius 1, exterior radius %
including the boundary of the inner circle;
(e) Open circular region with center at (1,0) of radius 1;
(f) Closed circular region of radius 1 with center at z₀ = x₀ + iy₀;
(g) Open region exterior to the circle of radius 2 with center at (0, — 1).
Section 3, Pages 539-540
*. («) M(« _1 + *) coe 2 + - e) ein 2; (c) Io « '''»-(./<+«*.); („) «wi+»r
4. (c) J$(« + « _J ) dn 1 + H*’(« — « -1 ) co» 1;
(d) «*^**(ooe 2*y + % sin 2a?y);
(e) e* /< ^ + ^ ) {cos [!//(«* 4 y*)J - i sin {y/(a^ 4 y*)]|.
5. (o) log 4 4- (W log 5 4 i-ar/2; (e) e~ T/a ; (/) «(cos 1 4 i sin 1);
(ff) HU* - «-*).
6. (a)(r 4 2dfc)i; (i>) r/2 4 2r* -ilog(2db y/S); (d)2*k,k - 0, 41, db2,
Section 4, Page 545
8 . Q>) % «■ —1; (c) i «• 0; (d) * — r/2 4 fcir, fc « 0, =bl, ±2, . . .; (e) * — 1, t » — 1
(/)» (y)i W at all points; (i) * «■ 0, (k) z « d=i.
Section 5, Page 548
1. M(2 4 lli).
2. 1 along rectilinear, 1 + i/3 along parabolic.
5. 0. 6. 0. 7. (-2).
Section 7, Page 555
2. 2πi. 3. 2, upper half; -2, lower half.
5. 0. 10. (a) 0; (b) 0; (c) -πi; (d) πi.
Section 8, Page 559
1. 2πi(8 - 13i). 3. (a) 0; (b) 2πi; (d) 0; (e) 2πi.
6. -2πi.
Section 9, Page 561
1. u = x³ - 3xy².
2. (a) x + iy; (b) cosh y cos x - i sinh y sin x; (d) e^x(cos y + i sin y).
Section 10, Pages 564-565
a .f
nZl n
oo -n « / i\«- 1 2 2»—1
*• (0) S »!' * “ (fc) .?x (2n — 1)1 * “ * :
(d) E (-1)— «-l.
«— 1 ft
Section 11, Page 569
1. (o)i + 2 + 3* +4^+ •••;
z
<&) (nb)» + i~i + 1 + - *) + o - *) 3 + a - *)‘ + • • ••
„ . _i , j_ j_ , j__
*» + 2!*< “ 31»* ^ 4!*»
3. Functions are expressed in Laurent's series.
4. (a) - — i- - l _ (* - 1) - - 1)* ;
Z —■* x
2 3 1+2* 1+2*
< 6) ; + ? + -£- + -^- +
i * ** «* . i
^ 2 2* ~ 2 s 2 4
+ -+-, + -! +
Section 12, Pages 572-573
( 6 ) 1 --,+
2^
2! z 4
I i
3!*<
4 +
, residue
, residue (0);
y v 1 11 z z 2 . . ,
w r* _ ; + 2i _ 3! + 5i~" , ’ res,due( " 1);
(/) « 2 +* + £j +™j + •••, residue (HO;
(if)
2 4 8_
z* 2!z 2 3 Iz
( i ) Residue — at z *
6. No.
16 32s
4 ! 5 !
1, residue }$ at- z »
- •••, residue (~%Q:
1.
Section 13, Page 574
1. (a) -πi; (c) 2πi/3!; (d) -8πi/3; (f) 0.
2. 2πi. 4. (a) 0; (b) 2πi.
Section 18, Page 586
1. (a) cos x cosh y, sin x sinh y; (b) e^x cos y, e^x sin y;
(d) log (x² + y²)^(1/2), tan⁻¹ (y/x); (e) x/(x² + y²), -y/(x² + y²).
2. (c) v = e^x sin y - x; (d) sinh x sin y.
CHAPTER 8
Section 1, Page 612
1. H. Hs.
». H.
6. 81, 71, 71/2.
7. (a) X; (b) 0; (c) Ms; W H-
Section 2, Page 616
l. W, H, H, X, X, H, X . X-
6. Mi-
Section 3, Pages 621-622
1. 33/16,660.
•• Xi, Xl, Xt-
2- Ho, X , Ho, Xs, 1,323/46,189.
4815 ! 13147 ! 47151
‘ 52! ’ 8152! ’ 30 52! '
12 2m + 2n — 4
6 . —t —
n m mn
2. Questions 1, 2, 8 can be answered.
7. X-
2. %, Ha.
4. ‘Ho-
». H, H».
7 . Ho, Ho-
51 5! 18,781
52* < 52 . . . 48’ 13‘
6. » > (log 2)/(log fl — log 5).
„ /3 \* 1211 ... 8
‘■U >
52-51
48
270,840
52-51 -50-49*
10- 1 - j>* n - 3.
Section 4, Pages 626-627
1- 8 p. 2. E(XY) ** 6 pq.
8. (a) 1; (6) 1; (c) 1. 4. X 2 + Hi ~f Xo + * * • + H
6 - 2 .
Section 5, Page 631
1. Hit Hit l %2 1 l %2t Hi, X*. 4.
Section 6, Page 637
1. 0.75, 0.60, F(x) = x³ for 0 < x < 1, m = 0.794.
2. Xt, 0.206. 8.
4. H - 6- X-
Section 7, Pages 640-641
1. W«, H*> H, 2 and 3. 2. (a) 125/3,888, (6) 2% 48 .
3. (0.65)¹⁰ + 10(0.65)⁹(0.35) + 45(0.65)⁸(0.35)² + 120(0.65)⁷(0.35)³.
4. 5(X>« + 4 (H) b (H) + 45(X) 4 (X) 2 + 40(X) 8 (X) 8 + 15(X) 2 (X) 4 .
B. 741/2,728. 7. 0.57, 0.57, - - —
n n
Section 8, Page 644
1. 0.499.
Section 9, Page 650
1. 0.039.
3. 0.979.
5. 0.083, 0.166, no.
Section 10, Page 654
1. 1,540.
3. *X 0 .
Section 11, Page 658
1. (a) 0.368, 0.402; (b) 0.368, 0.373.
8 - 1,005.
6. 0.577.
2. *(1.5) - *(-1.0) * 0.806.
4. 0.0222.
2. 46,413/78,125.
4. 0.91854.
2. 0.758.
4. Expected number » 5.
Section 12, Page 663
2. 0.82.
Section 13, Page 667
1.
3. 1/√3 = 0.577.
6.83.
2 . H, H, *H,ye*,P - -l.
Section 14, Pages 670-671
1. 7.5, 0.26. 2. 0.145, 54.
CHAPTER 9
Section 1, Page 679
1. (a) -0.8 < x₁ < -0.7, x₂ = 2, x₃ = 4;
(b) -0.8 < x₁ < -0.7, 1.2 < x₂ < 1.3;
(c) -0.8 < x₁ < -0.7, -0.6 < x₂ < -0.5, 1.0 < x₃ < 1.1;
(d) -0.6 < x₁ < -0.5; (e) 4.4 < x₁ < 4.5.
2. h ~ 1.23.
Section 2, Page 684
1. Prob. 1: (a) -0.75; (b) -0.73, 1.22; (c) -0.77, -0.55, 1.08; (d) -0.57; (e) 4.49.
Prob. 2: 1.226.
2. -0.942, -0.200, 1.045.
Section 3, Page 686
3. 2.310 radians.
4. (a) 0.739; (b) 0.667; (c) -0.725, 1.221; (d) 2.924; (e) 1.045, -0.942, -0.200.
Section 4, Pages 688-689
(a) x « x Hb; y - 2 Hb', * ~ -Ha'p
(b) x₁ = 1; x₂ = -1; x₃ = -2; x₄ = 3;
(c) x₁ = -0.107; x₂ = 0.988; x₃ = 0.317.
Section 7, Pages 694-696
1. y « 0.253* - 0.503 -f 0.25.
Section 8, Page 696
2. 9.466, 12.549.
Section 9, Page 700
1. 2.784, 2.700. 2. If % « 60, 9 - 40.82, 42.52, 42.50.
3. 2.581, 2.627. 4. 106.09.
Section 10, Page 702
1. (a) y - K* + H; (5) y - 2.53° *; (c) 0.3(10° **).
Section 11, Page 711
1. y = 4.98 - 3.13x + 1.26x².
3. It - 1.778; 6 - 1.9349; 9 « 60.02(0.861)*.
Section 12, Page 715
1. y = 0.75 + 0.10 cos x - 0.05 cos 3x - 0.29 sin x.
2. y = 0.85 - 0.25 cos 2x - 0.05 cos 4x + 0.05 cos 6x + 0.26 sin 2x - 0.03 sin 4x.
Section 13, Pages 720-721
1. 25.252, 25.068.
8. 128.6.
6. 39.30, 38.98.
2. 132.137.
4. 666.25, 666.00.
Section 14, Page 738
1. y(±0.2) = 1; y(±0.4) = 1.02; y(±0.6) = 1.061; y(±0.8) = 1.124; y(±1.0)
= 1.214. The corresponding exact values are 1.010, 1.041, 1.094, 1.174, 1.284.
*. v - vi + y'(ii)(* - *0 + — (v'ta) - v'fa>)K* - si) 1 -
2 h
3. y₁ = 1.0100; y₂ = 1.0403; y₃ = 1.0927.
Section 15, Pages 726-727
1. yt - 2.0442; y 1 - 2.3274; yt - 2.6509; y» - 3.0190; y w - 3.4363.
3. y(0.3) = 1.3498; y(0.4) = 1.4917; y(0.5) = 1.6485; y(0.6) = 1.8218.
4. 0.2740.
Section 16, Page 730
1. y « x 4 - x 2 + M** 4- H 4- Ra; * - x 4- M** +
2. 0.1, 0.2205, 0.3627, 0.5281.
3. 1.01, 1.031, 1.063.
4 . v l + + M*? + H2X 4 + Ho**
z m X + 4- H® 3 4* H 2^ 4 4 * •
5. 1.0052, 1.0215, 1.0502.
APPENDIX A
Section 1, Page 747
1. (18, 0, -1, 0). 2. (18, 0, -1, 0). 8. 1.
Section 2, Pages 752-753
1. (a) (2, -1, 1); (6) (1, «, -Jf); (c) (3, -1, 2); (d) (1, r l, -2, 3).
2. (a) (~*/7, S*/7, A?); (5) (0, 0); (c) (0, 0, 0); (d) (l fc/4 , 7*/8, *); W ft 2k, 0);
(/) (0, 0, 0).
3. (a) (1, -1); (b) inconsistent; (c) inconsistent; (d) (1, 3k - 2, k).
APPENDIX B
Section 2, Pages 758-759
2. y * coe 2x 4- H sin x — H »in 2x. 8. y - 3^e x — 4- M** — x.
6.if-24«~ 2 *- xe"~® - 3c"~ x . 6. y - e
Section 3, Pages 761-762
1 * V *• fSc"** - 2. y «* 1, y *» x, y - for x > 0.
Section 4, Pages 764-765
1 . \ e“' 2x cosh -7== x — - \/3 «~ 2x sinh x.
3 \/3 2 V%
SL (a) cos x; (b) e~ x cosh x 4 - Mt~* sink x ** ?£ 4 - K«~ 2as ;
(c) e““ H *(co8h H\/6 x — sinh 4£\/6 x).
Section 5, Page 767
tie
J 0 ^0
a. (•) J4 - «*■* + (b) K«e** - ««- +
(c) Hx-H+t-*- He-**.
Section 6, Pages 768-769
+ ( t cuL — g -(»/A)< _ oai cos u( + cR sin at with c *»
l e -(RIL) t
fi* + «*£*'
S. L~ l f V(r)«-< B /«<‘“ T )dr - R - 1 fV( t )[1 - *—€»/«<*— r>) (Jr
Jo JO
+ m u _^,
INDEX
The letter p. after a page number refers to a problem, the letter n. to a footnote.
Abel's theorem, on differential equations, 54p.
on power series, 142
Absolute convergence, of integrals, 755
of series, 127, 170
Absolute value of complex numbers, 168, 528
Acceleration, normal, 313
tangential, 313
vector, 302p., 313
in cylindrical coordinates, 367p.
Adams' method, 723, 728
Addition, of complex numbers, 166, 529
of matrices, 327
parallelogram law of, 288, 529
of series, 117
of vectors, 288, 317
Adiabatic expansion, 50p.
Algebraic equations, 677
solution of, by graphical methods, 678
by iterative methods, 679
systems of linear, 350, 687, 689, 749
Alternating series, 128
Amplitude, of simple harmonic motion, 44
of waves, 428
Analytic functions, 540
branch points of, 570
Cauchy's formula for, 555
Cauchy's theorem for, 547
differentiation of, 541, 557
essential singular points of, 570, 574
geometric representation of, 575
integrals of, 545, 547, 551
Laurent's expansion for, 565
mapping by, 575-594
maximum modulus theorem for, 558, 561
poles of, 570
residue theorem for, 573
residues of, 570
singular points of, 543, 569
Taylor's series for, 561
Angle, phase, 44, 528
solid, 399p.
Angular momentum, 305
Angular velocity, 302, 399p.
Antenna, radiation from, 486
Arc length, 301
in curvilinear coordinates, 362
of an ellipse, 147
Argand’s diagram, 528
Argument of a complex number, 528
Arithmetic means, 667
Asymptotic equality, 12, 123
Atmospheric pressure, 50
(See also Pressure)
Attraction, of a cone, 277 p.
Coulomb’s law of, 408n.
of a cylinder, 277p.
Newton's law of, 46
of a sphere, 47, 277 p., 410
Augmented matrix, 750
Average, arithmetic, 667
Average-value theorem, 498
Base or coordinate vectors, 319
in curvilinear coordinates, 363
in cylindrical coordinates, 367p.
orthonormal, 321
in spherical coordinates, 367p.
transformation of, 337, 367p.
Basis, 319
Beams, bending of, 15
buckling of, 95p.
cantilever, 16
on elastic foundations, 86p., 94p.
vibration of, 435p.
Bending moment, 16, 435p.
Bernoulli's differential equation, 27
Bernoulli-Euler law, 16
Bessel's differential equation, 159
Bessel’s functions, 162, 198, 480
asymptotic formulas for, 199p
expansion in series of, 481
generating function for, 166p.
orthogonality of, 198
zeros of, 198
Bessel’s inequality, 202
Beta function, 149, 765
Biharmonic equation, 430p.
Bilinear forms, 349
Bilinear transformation, 577 p.
Binomial distribution, 639
generating function for, 640p.
Binomial frequency function, 639
Binomial law of probability, 639
Laplace’s approximation to, 647
normal approximation to, 647
Binomial theorem, 155
Binormal, 312
Boltzmann constant, 633
Boundary-value problems, 91, 442, 730
Bounds for Fourier coefficients, 211
Branch points, 570
Buffon’s needle problem, 637
Cable, flow of electricity in, 514
hanging under gravity, 40, 454
oscillations of, 445, 454
supporting roadway, 42p.
Calculus of variations, 264
isoperimetric problems, 269
problems with constraints, 269
Cantilever beam, 16
Cartesian reference frames, 321
Catenary, 41
Cauchy’s convergence criterion, 115
Cauchy’s differential equation, 78
Cauchy’s inequality, 322
Cauchy’s integral formula, 555
Cauchy’s integral test, 120
Cauchy’s integral theorem, 547
Cauchy’s principal value of an integral,
602
Cauchy-Riemann equations, 413
Cauchy-Schwarz inequality, 322
Center, of gravity, 275, 281
of mass, 44, 303
motion of, 44, 304
Chain under gravity, 40
Chain rule, 228
Change of variables, in functions, 237
in integrals, 270
Channel, flow from, 594
Chaplygin’s method, 37
Characteristic equation, 54, 67, 521, 733
of a matrix, 344
for systems of linear differential equations, 100, 106p., 733
Characteristic frequencies, 477, 479p., 481,
482p.
Characteristic functions, 732
Characteristic values, 344, 507p., 732
Characteristic vectors, 344
Characteristics, 440, 508, 517
discontinuities on, 519
Chemical combinations, 14
Chemical reactions, 15
Circle of convergence, 170, 562
Circulation, 397, 591
Clairaut’s equation, 27 p.
Cofactors, 741
Column, axially loaded, 86p., 90
Euler's critical load for, 92
Combinations, 611
Combinatory analysis, 611
Comparison tests, for integrals, 755n.
for series, 122, 125, 134
Complementary function, 59
Complex function, 534
continuity of, 540
differentiation of, 541
integration of, 543
Complex numbers, 166, 527
absolute value of, 168, 528
addition of, 166, 529
argument of, principal, 528
conjugate, 529
modulus of, 528
operations on, 528
phase angle of, 528
polar form of, 528
roots of, 531
Complex potential, 587
Complex roots of unity, 532
Complex variable, elementary functions of, 534
Complex-variable theory, 523-604
Components of a vector, 291, 317, 321
Composite functions, 230
Compound probability, 617, 635
Condenser discharge, 81, 84
Conductivity, thermal, 415, 455
Conformal mapping, 583, 598
examples of, 575-595
invariance of harmonic functions under,
585
Riemann’s theorem on, 586
Schwarz-Christoffel formula for, 599
Conjugate complex numbers, 529
Conjugate harmonic functions, 560, 588
Conjugate matrix, 343
Conservation, of energy, 43
of matter, 417
of momentum, 44
Conservative force fields, 408
Constraints in calculus of variations, 269
Continuity, 218
of complex functions, 540
equation of, 412, 417
piecewise, 755
of scalar functions, 368
of vector functions, 299, 368
Contour integrals (see Line integrals)
Convergence, circle of, 170, 562
interval of, 139
radius of, 139, 170, 562
of series (see Series)
uniform, 132
Convolution, 488
Convolution theorem, 488, 762
Coordinate lines, 272, 359
Coordinate surfaces, 359
Coordinate vectors, 319
Coordinates, affine, 366
curvilinear, 357
divergence in, 406
gradient in, 407
volume in, 364
cylindrical (see Cylindrical coordinates)
orthogonal, 363
parabolic, 408p.
spherical, 360, 367p.
Correlation coefficient, 666
Coulomb's law, 408n., 467
Couple, 305
Covariance, 664
Cramer's rule, 326, 749
Cross product, 294
Crout's reduction, 687n.
Curl, 396
in cartesian coordinates, 398
in curvilinear coordinates, 406
relation to rotation, 399p.
Current flow, 416
in cables, 514
in electrical circuits, 81, 87, 100, 761p., 768p.
Curvature, 150, 311
Curve, elastic, 16, 86p.
Frenet’s formulas for, 311
integral, 7
length of, 301
minimizing, 265
of minimum descent, 269p., 767
motion on, 301, 313
normal to, 311
piecewise or sectionally smooth, 372
pursuit, 33
on a surface, 309
trihedral associated with, 312
Curve fitting, by finite differences, 694
by graphical means, 701
by least squares, 702
by trigonometric functions, 711
Curvilinear coordinates (see Coordinates)
Cycloid, motion on, 46 p., 269p., 767
Cylindrical coordinates, 359
acceleration components in, 367p.
base vectors in, 367p.
velocity components in, 367p.
volume element in, 365
D, 57
∇, 370
∇², 387, 407
Δ, 510, 692
δ, 761
D'Alembert’s solution of wave equation,
439, 485
Damped oscillations, 449
Damping, viscous, 82, 449
Definite integrals (see Integrals)
Deformation of contours, 549
Del, ∇ (see Gradient)
de Moivre's formula, 530
Dependence, linear, 52, 70, 317
Derivative, directional, 243, 253, 369
(See also Gradient)
normal, 244, 253, 369
partial, 219
Determinants, 325, 741-758
cofactors of, 741
differentiation of, 743
expansion of, 326, 741
minors of, 741
multiplication of, 325, 745
solution of equations by, 326, 748
Wronskian, 52, 54 p., 71
Difference equations, 510, 734
Dirichlet’s problem for, 511
elliptic, 511, 518
hyperbolic, 513, 518
parabolic, 512, 518
Difference operators, 510, 692
Differences, backward, 692
finite, method of, 734
forward, 691
Differential, 223, 310
approximations by, 226, 311
of arc length, 362
exact, 226, 380
total, 226, 234
of volume, 364
Differential equations, elliptic, 505, 511,
518
Euler’s, 78, 267
exact, 20
hyperbolic, 507
Lagrange’s, 26
ordinary (see Ordinary differential equations)
parabolic, 506, 512
partial (see Partial differential equations)
systems of, 95, 733
Differential form, quadratic, 362
Differential operators, 57, 430p.
Differentiability, 226
Differentiation, of analytic functions, 542
chain rule for, 228
of composite functions, 230
of definite integrals, 261
of determinants, 743
of Fourier series, 210
of implicit functions, 230, 235
of infinite series, 135
numerical, 698
partial, 219
of power series, 140
of vector functions, 299
Diffusion, 14, 416, 463n.
Diffusivity, 463n.
Dimensional analysis, 433
Dipole, 408p., 496
Dirac's delta function, 761
Dirac's distribution, 759
Direction cosines, 370
Directional derivative, 243, 253, 369
(See also Gradient)
Dirichlet’s conditions, 180
Dirichlet’s kernel, 205
Dirichlet’s problem, 467, 502, 511, 595
for arbitrary regions, 595
for a circle, 469
for a half plane, 484
for a half space, 503
Dirichlet's theorem, 180
Discontinuity, simple, 178
Discrete distributions, 627, 628
Discrete variables, 628
Dispersion, 427
Distribution function, 632
Distributions, binomial, 639, 640p.
bivariate, 666
continuous, 631
discrete, 627, 628
Gaussian, 633
Maxwell-Boltzmann's, 633
normal, 633, 651
Poisson’s, 633
Divergence, 384
in cartesian coordinates, 386
in curvilinear coordinates, 406
Divergence theorem, 388, 493
Dot product (see Scalar product)
Double layer, 497
Dummy or summation index, 324
Dynamics, laws of, 302
e, 655n.
e 4 *, 173, 536
e^z, 172, 536, 581
Eigenfunction (characteristic function), 732
Eigenvalue (characteristic value), 344, 507p., 732
Eigenvector (characteristic vector), 344
Elastic curve, 16, 86p.
curvature of, 16
Elasticity, 599
Electric circuits, 81, 87, 100, 756
Electromechanical analogies, 81
Electron, acceleration of, 45, 46p.
mass-to-charge ratio, 100p.
Electrostatic field, 592, 594
Electrostatics, 457, 496
Ellipse, length of, 147
Elliptic differential equation, 505
difference equation for, 511, 518
Elliptic integrals, 49, 86p., 148
Emissivity, 463
Empirical formulas, 701
Energy, conservation of, 43, 48
kinetic, 43, 306p.
potential, 43, 264
Envelope, 35
Equation of continuity, 412, 417
Error function (probability integral),
table, 776
Errors, estimate of, 658, 660
Gauss’ law of, 662
mean-absolute, 662
mean-square, 662
probable, 662
in solving differential equations, 38,
104
theory of, 658
Essential singular points, 570, 574
Estimate, of errors, 658, 660
maximum likelihood, 640, 660
reliability of, 670
unbiased, 640
of variance, 669
Euclidean space, 321, 374n.
Euler's critical load, 92
Euler’s differential equation, 78
in variational calculus, 267
Euler’s formula, for exponentials, 173, 536
for Fourier coefficients, 175, 196
Euler’s hydrodynamical equations, 419
Euler's polygonal curves, 721, 727
Euler's theorem on homogeneous functions, 234
Euler-Fourier formula, 175, 196
Even functions, 183
Fourier expansion for, 184
Events in probability, 610, 618, 619, 638
Exact differential, 226, 380
Exact differential equations, 20
Expansion, adiabatic, 50p.
of determinants, 326, 741
Fourier, 175, 196
Heaviside, 766
Laurent, 564, 565
Maclaurin, 144
in power series, 144
in series of orthogonal functions, 201
Taylor, 144
Expectation, 623, 634
of product, 663
of sum, 624
Expected frequency, 627
Expected value, 623, 639
Exponential function, 172, 536, 581
Extrapolation formulas, 696
Extreme values, 250, 264
Extremum, 250
Factor, integrating, 22
Factorial, n!, approximation for, 644
Factorial function, 162, 755
(See also Gamma function)
Falling bodies, 47
Fermat’s principle, 264
Field, 367
conservative, 408
electrostatic, 467, 496, 592, 594
gravitational, 409, 467
irrotational, 402
solenoidal, 402
Field theory, 355-420
Finite differences, method of, 734
Flexural rigidity, 93
Fluid flow, 411, 416, 587-595
under dam, 593
ideal, 419, 587, 592
incompressible, 412, 418
irrotational, 412, 588
out of channel, 594
solenoidal, 412
stagnation points in, 590
steady, 412, 587
vortex in, 591
Flux, 384
Force field, 408
electrostatic, 467, 496, 592, 594
gravitational, 409
Forced vibrations, 86, 451
Fourier coefficients, 175, 196
bounds for, 211
Parseval's equality for, 202, 204p.
Fourier expansion, 175, 196
for odd functions, 185
Fourier heat equation, 414
Fourier integral equation, 192
Fourier integrals, 190, 194
Fourier series, 175, 196
complex form of, 192
convergence of, 200, 204
differentiation of, 210
double, 476
for even and odd functions, 184
extension of interval for, 187
integration of, 207
uniqueness theorem for, 186
Fourier transform, 194, 482-490
Free vibrations, 79, 432, 444, 446, 475
Frenet-Serret formulas, 312
Fresnel integrals, 147, 153p.
Frequency, characteristic, 477, 479p.,
482p.
relative, 615, 638, 642
resonant, 89, 477
Frequency equation, 478, 481
Frequency function, 627
binomial, 639
Fuchs’ theorem, 157
Fundamental theorem of integral calculus,
9, 261, 550
Gamma function, 149p., 162
Gas, ideal, 221
viscosity of, 451
Gauss’ distribution, 633
Gauss’ divergence theorem, 388
Gauss’ law of errors, 662
Gauss’ reduction method, 350n., 687
Gauss-Jordan reduction, 687n.
Gauss-Seidel method, 689n.
Geometric series, 115
Gradient, ∇, 244, 367, 390
in cartesian coordinates, 370
in curvilinear coordinates, 407
Graeffe’s root-squaring method, 679n.
Gram-Schmidt method, 351
Graphical solution of equations, 678
Gravitational attraction, 277p., 409
motion under, 47, 49p.
Gravitational constant, 46
Gravitational field, 407
Gravitational potential, 409
Gravity, center of, 275, 281
Gravity dam, 593
Greatest lower bound, 773
Green’s function, 501
for half space, 502
Green’s identities, 391p., 493
Green’s theorem, in plane, 391, 402p.
symmetric forms of, 391p., 493
Growth factor, 418
Harmonic analysis, 711
Harmonic function, 468, 560, 585
average value theorem for, 498
conjugate, 560, 588
differentiability of, 559
maximum values of, 499p., 506, 558, 561
Harmonics, 177
Heat capacity, 455
Heat equation, 414, 455
solution of, by integrals, 482
by separation of variables, 459
by series, 455-471
uniqueness of, 466, 506
Heat flow, 414, 455-467, 483, 504, 512
connection with random walks, 653
in a rod, 456-466, 489, 769
source function for, 491, 653
in a sphere, 471
Heat source, 489, 504, 653
Heaviside’s expansion theorem, 766
Helix, 314, 316p.
Helmholtz formula, 499
Hermitian form, 348
Hermitian matrix, 349
Holomorphic function, 543
Homogeneous differential equations (see
Ordinary differential equations)
Homogeneous functions, 18, 234
Euler’s theorem on, 234
Hooke’s law, 80
Horner’s method, 679n.
Hydrodynamics, 416, 419
(See also Fluid flow)
Hydrostatic pressure, 593
Hyperbolic differential equation, 507
difference equation for, 513
Hyperbolic functions, 537, 591
Hypergeometric equation, 165
Ideal fluid, 419
Images, method of, 448, 462p.
Implicit functions, differentiation of, 230, 235
Improper integrals (see Integrals)
Impulse function, 759
Indefinite integral, 551
Independence, linear, 52, 70, 317
of path, 378, 393
Independent events in probability, 610,
618, 638
Indicial equation, 161
Inertia, moment of, 274
Infinite series (see Series)
Inner product (see Scalar product)
Integral calculus, fundamental theorem of,
9, 261, 550
Integral curve, 7
Integral equations, 767
Integrals, of analytic functions, 545, 547,
551
of Cauchy’s type, 557
change of variables in, 270
of complex functions, 545
contour (see Line integrals)
convergence of, absolute, 755
differentiation of, 261, 262
elliptic, 49, 86p., 148
evaluation of, by fundamental theorem,
9, 261
by numerical methods, 717
by residue theorem, 599
by series, 147
improper, 118, 553n., 602
principal value of, 602
indefinite, 551
Lebesgue, 774
line (see Line integrals)
mean-value theorem for, 380
multiple, 270
particular (see Particular integrals)
probability, 776
Riemann, 771
Stieltjes, 630
surface, 277, 373
transformation of, 382-402
volume, 374
Integrating factor, 22
Integration, numerical, 715
Interpolation, 679
Interpolation formulas, 696, 699
Interval, closed, 132n., 772n.
of convergence, 139
open, 218, 772n.
Inverse elementary functions, 539p., 540p.
Inversions, of matrices, 333, 350
of order, 742
Irrotational field, 402
Irrotational flow, 412
Isoclines, 36
Isolated singular points, 569
Iterative methods, 679, 684, 689, 721-730
Jₙ(x) (see Bessel’s functions)
Jacobian, 238, 242p., 271
Jump of a function, 520
Kₙ(x) (see Bessel’s functions)
Kinetic energy, 43, 306p.
Kronecker delta, 321
Lagrange’s differential equation, 26
Lagrange’s interpolation formula, 699
Lagrange's multipliers, 250, 254
Laplace transform, 754-769
bilateral, 762n.
convolution theorem for, 762
of derivatives, 756
of Dirac’s “function,” 759
Heaviside’s theorem on, 766
solution by, of differential equations,
756-762
of integral equations, 767
tables of, 770
unilateral, 762n.
Laplace’s difference equation, 511, 735
Laplace’s equation, 409, 413, 416, 419, 464,
467, 735
Laplace's law in probability, 647
Laplace-de Moivre limit theorem, 648
Laplacian operator, 387
in curvilinear coordinates, 407
Laurent’s expansion, 564
uniqueness of, 567
Laurent’s theorem, 565
Law, of errors, 662
of large numbers, 650
of mechanics, 302
Newton’s (see Newton's law)
parallelogram, 288, 529
of probability, binomial, 639, 647
normal, 647, 653
of reflection, 297 p.
of refraction, 297p.
of small numbers, 654
Least squares, 663p., 702
connection with orthogonal functions,
200
curve fitting by, 702
Lebesgue integral, 774
Lebesgue theorem, 775
Legendre polynomials, 150
expansion in series of, 196, 473
generating function for, 159p.
orthogonality of, 198
Rodrigues’ formula for, 159p.
Legendre's equation, 158
Leibniz' formula, for differentiation of
integrals, 262
use in evaluating integrals, 263
Leibniz’ test, 128
Length of arc, 301, 362
of an ellipse, 147
Level surface, 369
Line, equation of, 306
Line integrals, 373
of analytic functions, 547-554
in complex plane, 545
independent of path, 378, 393
transformation of, 382-402
of vector functions, 374
Linear algebraic equations, 350, 687, 689,
749
Linear dependence, 52, 70
of vectors, 317
Linear differential equations (see Ordinary
differential equations)
Linear fractional transformation (bilinear
transformation), 577 p.
Linear operators, 336, 754
Linear transformation, 332
Linear vector spaces, 316
Linearity, property of, 51
Lipschitz condition, 38
Log z, 537, 581
Logarithmic function, 537, 581
principal value of, 537
Lower bound, 773
M test, 134
Maclaurin’s formula, 260
Maclaurin's series, 144
Mapping, by analytic functions, 575-594
conformal (see Conformal mapping)
Mass, center of, 303
motion of, 44, 304
Matrices, algebraic operations on, 327-331
inversion of, 333, 350
product of, 328
transformation of, 340-350
Matrix, 327, 749
augmented, 750
characteristic equation of, 344
characteristic values of, 344
conjugate, 343
determinant of, 750
diagonal, 329, 339, 343
Hermitian, 349
identity, 329
inverse of, 333
orthogonal, 340
rank of, 330, 750
scalar, 329
singular, 331
square, 327
symmetric, 347
transpose of, 334
unit, 329
unitary, 343 p., 350
zero, 329
Maxima and minima, 246
absolute, 247
constrained, 249, 269
relative, 247
(See also Calculus of variations)
Maximum modulus theorem, 558
Maximum principles, 506, 507
Maxwell-Boltzmann distribution, 633
Mean errors, 200, 659, 660
reliability of estimate of, 670
Mean-value theorem, of differential calculus, 224
for integrals, 380
Measurable set, 773
Measure, 772
Measure numbers, 291, 319
Measure theory, 614, 772
Mechanics, laws of, 302
Median value, 637p.
Membrane, under gas pressure, 482p.
vibration, of circular, 480
of rectangular, 474
Metric coefficients, 360
Minima (see Maxima and minima)
Minimax, 249
Minimizing curve, 265
Minimum descent, curve of, 269p., 767
Minimum potential-energy principle, 264
Minors of a determinant, 741
Modes, 477, 481
Modulus of a complex number, 528
Moment, bending, 435p.
of dipole, 497
of force, 303
of inertia, 274
of momentum, 305
Momentum, angular, 305
linear, 42, 44
moment of, 305
Momentum vector, 305
Monte Carlo methods, 652
Multiple integrals, 270
Multiply connected region, 383
Mutually exclusive events, 610, 619
Nabla or del, ∇, 370
Neighborhood of a point, 540
Neumann’s function, 503
for half plane, 504
Neumann’s problem, 503
Newtonian potential, 277p., 409
Newton’s interpolation formulas, 696
Newton's law, of attraction, 46, 409
of cooling, 461
of gravitation, 46, 408
of motion, 42, 43
Newton's method of solving equations, 684
Nodal lines, 477
Nodes, 429, 445
Normal, to a curve, 311
principal, 311
to a surface, 309, 369
Normal acceleration, 313
Normal derivatives, 244, 369
Normal distribution, 633, 651
bivariate, 666
Normal equations, 703
Normal law of probability, 647
interpretation of, 653
Normal line, 309
(See also Normal)
Numerical analysis, 673-736
Numerical differentiation, 698
Numerical integration, 715
Numerical solution of differential equa-
tions, 37, 721-736
Odd functions, 183
Fourier expansion for, 185
Operator, curl, 398, 407
D, 57, 430p.
∇, 370, 386, 407, 692
∇², 387
Δ, 510, 692
difference, 510, 692
div, 386, 406
Fourier transform, 482
Laplace, 387
Laplace transform, 754
linear, 336, 754
Order, of differential equations, 6, 29, 76,
425
interchange in partial differentiation,
221
inversions of, 742
reduction of, 29, 76
Ordinary differential equations, 1-106
Abel's theorem for, 54p.
Bernoulli’s, 27
Bessel's, 159
boundary-value problems in, 91, 730
Cauchy’s, 78
Chaplygin's method for, 37
characteristic equation for, 54, 67, 101
Clairaut’s, 27p.
with constant coefficients, 54, 66, 100
of electric circuits, 81, 100, 761p.
Euler’s, 267
Euler-Cauchy’s, 78
exact, 20
existence and uniqueness theorems for,
5, 7, 38, 157
first-order, 17-50
linear, 23, 51, 59
Fuchs’ theorem on, 157
Gauss’ hypergeometric, 165p.
homogeneous, first-order, 18
linear, 51, 54, 59, 96
systems of, 100, 733
hypergeometric, 165
indicial equations for, 161
initial-value problem for, 9, 90, 730
integral curves for, 7
integrals of, 7
integrating factors for, 22
integration between limits, 13
isoclines for, 36
Lagrange’s, 26
Legendre’s, 158
linear, complementary function for, 59n.
with constant coefficients, 54, 66
systems of, 95, 733
with variable coefficients, 51, 59,
70, 153
order of, 6, 425
reduction of, 29, 76
with separable variables, 18
solutions of, 6
general, 10, 52, 59, 67, 102
by Laplace transform, 756-768
linearly independent, 7, 59, 72, 76p.
by numerical methods, 87, 721-736
particular, 7, 59, 72, 76p.
by power series, 153
singular, 8, 34
stability of, 103, 105
uniqueness of, 7, 38, 157
systems of, 95, 110, 727
characteristic equation for, 101, 733
Origin, 288
Orthogonal coordinates, 363
Orthogonal curves, 31
Orthogonal matrices, 340
Orthogonal sets of functions, 195
completeness and closure of, 203
expansion in series of, 201
relation of least squares to, 202
Orthogonal trajectories, 30
Orthogonal transformations, 340
Orthogonal vectors, 319
Orthogonality, weighted, 197
Orthogonalization, of matrices, 340, 350
of vectors, 320
Orthonormal functions, 195, 197
Orthonormal vectors, 320
Oscillations, of cable, 445, 454
damped, 449
period of, 44, 81, 84
of spring, 80, 82, 86, 88, 89
Osculating plane, 311
Parabolic coordinates, 408p.
Parabolic differential equation, 506
difference equation for, 512
Parabolic mirror, 33
Parallelogram law of addition, 288, 529
Parseval's equality, 202, 204p.
Partial differential equations, 5, 425-521
boundary conditions for, 443
canonical forms of, 517
characteristic values for, 507p.
characteristics for, 441, 521
of elliptic type, 504
of heat flow, 414, 455
of hydrodynamics, 416, 429
of hyperbolic type, 504, 516
of parabolic type, 504, 516
of potential theory, 409, 411n.
solutions of, by Fourier transform,
482-490
fundamental, 521
by integrals, 482-504
by Laplace transform, 769p.
numerical, 734
by series, 448-482
uniqueness of, 505-510
of vibrating membranes, 475, 480
of vibrating rods, 435p., 485p.
of vibrating string, 431, 484
of wave motion, 428
Partial differentiation, 219
interchange of order in, 221
Partial sum of series, 111
Particular integrals, 7
by method of undetermined coefficients,
59
by variation of parameters, 72
Pendulum, 48, 49, 85p.
Period, of oscillations, 44, 81, 84
of pendulum, 49, 85p.
of vibration, 443
of waves, 428
Permutations, 611
Phase of simple harmonic motion, 44
Picard’s method, 509
Piecewise continuity, 755
Piecewise smoothness, 206, 372
Plancherel’s theorem, 483n.
Plane, equation of, 306
osculating, 311
tangent, 277n., 309
Point at infinity, 577
Point set, 773
measure of, 774
Point vortex, 591
Points, in n-dimensional space, 316
in sample space, 613
Poisson's distribution, 633
Poisson's equation, 268, 411n., 495
uniqueness of solution of, 499p.
Poisson's formula, 495
for a circle, 470, 599p.
for a half plane, 490
for a half space, 503
Poisson's law of probability, 654
Poles of analytic functions, 570, 574
residues at, 571
simple, 570
Polynomial representation of data, 694,
702
Potential, complex, 587
electrostatic, 467, 496
gravitational, 409, 468
Newtonian, 277p., 409
Potential energy, 43
principle of minimum, 264
Potential theory, 468
(See also Dirichlet's problem; Neu-
mann's problem)
Power series, 138
Abel’s theorem on, 142
convergence of, absolute and uniform,
139
radius of, 142, 170, 562
differentiation and integration of, 140
evaluation of integrals by, 147
expansions in, 144, 561
multiplication of, 142
solution of differential equations by, 153
substitution in, 152
uniqueness of representation by, 141,
146
Precision constant, 662, 670
Pressure, atmospheric, 50
on gravity dam, 593
in star's interior, 48
Primitive, 551
Principal argument, 528
Principal normal, 311
Principal value, of improper integrals, 602
of log z, 537
Probability, 610, 614, 652
binomial law of, 639, 647
compound and total, theorems on, 617,
635
events in, 610, 618, 619, 638
Laplace's law in, 647
law of large numbers, 651
law of small numbers (Poisson's), 654
marginal, 625
normal law of, 647, 653
Probability density, 632, 634
Gauss', 633
joint, 634
Maxwell-Boltzmann’s, 633
Poisson's, 633
Probability integral, table of, 776
Probable error, 662
Probable value, 635
Product, of determinants, 325, 745
of matrices, 328
of vectors, 293, 295, 298
Projectiles, 50
Pulley, slipping of belt on, 11
Pursuit curves, 33
Pythagorean formula, 322
Quadratic forms, 347
differential, 362
positive definite, 349
Quadrature, 715
Radiation from antenna, 486
Radiation condition, 501
Radius of convergence, 139, 170, 562
Random molecular motions, 653n.
Random process, 622
Random variables, 623, 662
Random walks, 653
Rank of a matrix, 330, 750
Ratio test, 125
Rational number, 772
Reflection, law of, 297p.
transformation of, 341
Refraction, law of, 297p.
Regions, bounded, 535
closed, 218, 535
connected, 383
multiply, 383, 535
simply, 383, 535
finite, 535
open, 218
regular, 383
Relative frequency, 615, 638, 642
Remainder in Taylor's series, 144
Residuals, 703
Residue theorem, 573
evaluation of real integrals by, 599
Residues, 571
Resonance, 86, 89, 453
Resonant frequency, 89, 477
Riemann function, 508
Riemann integral, 771
Riemann's mapping theorem, 586
Rocket, motion of, 45, 46p.
thrust, 45
Rodrigues’ formula, 159p.
Roots of unity, 532
Rotation, of shaft, critical speed of, 94
transformation of, 341
velocity of, 399p.
Sample space, 613, 634
Scalar, 287
Scalar fields, 367
Scalar product, 293n., 319
Scalar triple product, 297
Schwarz’ inequality, 322
Schwarz-Christoffel mapping formula, 599
Seidel’s method, 689
Separation of variables, 18, 459, 732
Series, 111
addition of, 117
alternating, 128
basic properties of, 116
binomial, 155
Cauchy’s criterion for, 115
comparison tests for, 122, 125, 134
of complex terms, 169
convergence of, 112
absolute, 127, 170
conditional, 129
fundamental principle for, 114
in the mean, 200
pointwise, 204
tests for, comparison, 113, 122
integral, 118
Leibniz', 128
ratio, 125
uniform, 132, 136
Weierstrass test for, 134
differentiation of, 135
evaluation of integrals by, 147
Fourier (see Fourier series)
geometric, 115
harmonic, 113
integration of, 135
Laurent's, 565
Maclaurin’s, 144
multiplication of, 131
of orthogonal functions, 195
power (see Power series)
rearrangement of, 129
remainder in, 113
solution of differential equations by,
153, 465
sum of, 112
partial, 111
Taylor’s, 144, 561
telescoping, 116p.
trigonometric (see Fourier series)
Shaft, critical speed of rotation, 94
Shearing load, 435p.
Significance level, 649
Similar transformations, 339
Similitude, principle of, 433
Simple closed curve, 383n.
Simple harmonic motion, 44
Simple pendulum, 48
Simply connected region, 383
Simpson’s rule, 717
Singular integral, 493
Singular points, 543, 569
essential, 570, 574
isolated, 569
Singular solutions of differential equa-
tions, 34
Sink, 384, 591
Solenoidal field, 402
Solid angle, 399p.
Solution of differential equations (see
Ordinary differential equations; Par-
tial differential equations)
Sommerfeld’s radiation condition, 501
Sound, equation of propagation of, 420
velocity of, 435p.
Source, 384, 591
of heat, 489, 504, 653
Space, complex, 322
dimensionality of, 317
Euclidean, 321, 374n.
linear vector, 316
sample, 613, 634
Space curves (Frenet-Serret formulas), 315
Specific heat, 415
Spectral theory, 199
Spherical coordinates, 360, 367p.
Spring, oscillation of, 80, 82, 86, 88, 99
Stability, of columns, 91
of rotating shafts, 92
of solutions of differential equations,
103, 105
Stagnation points, 590
Standard deviation, 664, 669
of the mean, 667
unbiased estimate of, 670
Statistical hypothesis, 614
relation to significance level, 649
Steady-state solutions, 88, 765
Steady-state temperature, 457, 471
Stieltjes integral, 630
Stirling’s formula, 644
Stokes' theorem, 400, 402p.
Stream function, 413
Streamlines, 413, 414p., 587
String, vibration of, 425, 431-454, 484
Sturm-Liouville theory, 199
Summation convention, 324
Superposition principle, 766
Surface, level, 369
normal to, 309, 369, 373n.
one-sided, 383
piecewise smooth, 277, 383
regular, 384
tangent plane to, 277n., 309
two-sided, 277, 373n., 383
Surface integrals, 277, 373
Systems of equations, differential, 95, 733
linear algebraic, 350, 687, 689, 749
Tangent line, 301, 311
Tangent plane, 277n., 309, 311
Tautochrone, 767
Taylor's formula, 143
approximations by, 149
for functions of several variables, 257
Taylor’s series, 144, 561
Telegrapher’s equation, 514
Thermal conductivity, 415, 455
Torque, 303
(See also Moment)
Torsion, 312
Total differential, 226, 234
Total probability, 618, 635
Transcendental equations, 677, 679, 684
Transforms, Fourier, 194
Laplace, 754
Trapezoidal rule, 717
Trigonometric functions, 172, 536
Undetermined coefficients, method of, 59
Uniform convergence, 132
Uniqueness, of representation, in Fourier
series, 186
in Laurent's series, 567
in power series, 141, 146
of solutions, of ordinary differential
equations, 7, 38, 157
of partial differential equations, 505-
510
Unit impulse, 759
Unit vectors (see Base vectors)
Unitary matrices, 343p., 350
Unitary transformations, 343p.
Value, absolute, of complex numbers, 168,
528
characteristic, 344, 507p., 732
expected, 623, 639
extreme, 250, 264
maximum, of harmonic function, 499p.,
506, 558, 561
median, 637p.
principal, of improper integrals, 602
of logarithmic function, 537
probable, 635
Variance, 664, 669
Variation, 265
of parameters, 72
Vector, velocity, 301, 367p.
Vector acceleration, 302p., 313, 307p.
Vector analysis, 285-361
(See also Vector field theory)
Vector field theory, 355-420
Vector functions, continuity of, 299, 368
line integrals of, 374
Vector product, 295
Vector spaces, 316, 323
Vectors, algebraic operations on, 288-298,
316-324
base or coordinate (see Base or coordi-
nate vectors)
bound, 288
characteristic, 344
components of, 291, 317, 321
continued products of, 297
coordinate, 319
differentiation of, 299
free, 288
linear dependence of, 317
magnitude of, 288
momentum, 305
orthogonal, 319
orthonormal, 320
parallelogram law for, 288
product of, 293, 295, 298
sliding, 288
unit, 319
zero, 289, 317
Velocity, angular, 302
of escape, 47
of rotation, 399p.
of sound, 435p.
in wave motion, 434
Velocity potential, 413, 419, 584, 587
Velocity vector, 301, 367p.
Vibration, of beams, 435p.
of membranes, 474, 480
period of, 443
of string, 425, 431-454, 484
Viscosity of gases, 450
Viscous damping, 82, 80, 87, 449
Volume integral, 270, 374
Vortex, 591
Wave equation, 428, 432, 484, 499, 508
with damping, 449, 452
solution of, D'Alembert's, 439, 485
Fourier integral, 484
Fourier series, 448
by separation of variables, 463p.
uniqueness of, 439, 442, 508
Wave front, 486, 519
Waves, 425, 436
amplitude of, 428
period of, 428
plane, 488
shock, 520
standing, 428
Weierstrass M test, 134
Work, 302, 306p., 409, 418
Wronskian determinant, 52, 54p., 71
Zero vector, 289, 317
Zonal harmonics (see Legendre poly-
nomials)